The authors present phi-1, a new language model for code that is significantly smaller than competing models. phi-1 is a Transformer with 1.3B parameters, trained for 4 days on 8 A100s on a combination of "textbook quality" data filtered from the web and synthetically generated textbooks and exercises produced with GPT-3.5. Despite its comparatively small scale, phi-1 attains pass@1 accuracy of 50.6% on HumanEval and 55.5% on MBPP. Moreover, the model exhibits surprising emergent properties relative to phi-1-base (the model before finetuning on coding exercises) and phi-1-small (a 350M-parameter variant).
https://arxiv.org/abs/2306.11644
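
For readers unfamiliar with the pass@1 metric cited above, the sketch below shows the standard unbiased pass@k estimator from the HumanEval paper (Chen et al., 2021), which is the usual way such numbers are computed; this is an illustration of the metric, not necessarily the exact evaluation harness used for phi-1, and the sample counts in the example are made up for demonstration.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimator of pass@k for a single problem.

    n: total code samples generated for the problem
    c: number of samples that pass the problem's unit tests
    k: evaluation budget (k=1 gives pass@1)
    """
    # If there are fewer failing samples than the budget, at least one
    # sample in any k-subset must pass, so the probability is 1.
    if n - c < k:
        return 1.0
    # Probability that a random k-subset contains no passing sample,
    # subtracted from 1.
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical example: 200 samples per problem, 101 pass the tests.
print(pass_at_k(n=200, c=101, k=1))  # ≈ 0.505
```

The benchmark score is then the average of this estimate over all problems in the suite (164 problems for HumanEval).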