In recent research, smaller models have been enhanced through imitation learning, drawing on outputs generated by large foundation models (LFMs). These smaller models often suffer from limited imitation signals and a lack of rigorous evaluation, which leads to overestimation of their capabilities: they learn to imitate the style, but not the reasoning process, of LFMs. To address this, the authors developed Orca, a 13-billion-parameter model that learns to imitate the reasoning process of LFMs from rich signals such as explanation traces and step-by-step thought processes, tapping into large-scale, diverse imitation data with judicious sampling and selection. Orca outperforms conventional instruction-tuned models such as Vicuna-13B on complex zero-shot reasoning benchmarks like BIG-Bench Hard and AGIEval, pointing to a promising direction for improving model capabilities.
https://arxiv.org/abs/2306.02707
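
To make the idea concrete, here is a minimal, hypothetical sketch of how explanation-style imitation data might be assembled: a subset of tasks is sampled, and the teacher model is prompted with system instructions that elicit step-by-step reasoning rather than bare answers. The `query_teacher` callable, the system instructions, and the data layout are illustrative assumptions, not the paper's actual pipeline.

```python
# Hypothetical sketch of explanation-style imitation data collection.
# `query_teacher` stands in for a real LFM API call and is an assumption here.
import json
import random
from typing import Callable

# System instructions meant to elicit step-by-step explanation traces (illustrative).
SYSTEM_INSTRUCTIONS = [
    "You are a helpful assistant. Explain your reasoning step by step.",
    "Think through the problem carefully, then justify your final answer.",
]

def build_imitation_dataset(
    tasks: list[dict],
    query_teacher: Callable[[str, str], str],
    sample_size: int,
    seed: int = 0,
) -> list[dict]:
    """Sample a subset of tasks and collect (system, instruction, explanation)
    records from the teacher model for student fine-tuning."""
    rng = random.Random(seed)
    sampled = rng.sample(tasks, min(sample_size, len(tasks)))
    records = []
    for task in sampled:
        system = rng.choice(SYSTEM_INSTRUCTIONS)
        # The teacher's response includes its reasoning trace, not just the answer.
        response = query_teacher(system, task["instruction"])
        records.append({
            "system": system,
            "instruction": task["instruction"],
            "response": response,
        })
    return records

if __name__ == "__main__":
    # Stub teacher for illustration; a real pipeline would call an LFM here.
    fake_teacher = lambda system, instruction: f"Step 1: restate '{instruction}'. Step 2: ..."
    tasks = [{"instruction": f"Solve problem {i}"} for i in range(10)]
    data = build_imitation_dataset(tasks, fake_teacher, sample_size=3)
    print(json.dumps(data, indent=2))
```

The resulting records would then be used to fine-tune the smaller student model, so that it is trained on the teacher's reasoning traces rather than on final answers alone.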