Notes on OpenAI’s new o1 chain-of-thought models

On September 12, 2024, OpenAI introduced two new preview models, o1-preview and o1-mini, codenamed “strawberry.” The models are trained with reinforcement learning to hone their chain of thought, with the aim of improving “reasoning.” According to OpenAI’s documentation, they suit applications that need deeper reasoning and can tolerate longer response times than the GPT-4o models. They also introduce “reasoning tokens”: tokens the model spends working through a complex prompt before it answers. These tokens are billed as output tokens but are not visible in the API response, and that invisibility has proved controversial. Early examples show the models handling tasks that tripped up earlier models. Further exploration by the community should clarify which new applications of large language models they make practical.
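
Because reasoning tokens never appear in the response text, the usage counts are the only place they show up. The sketch below splits a response's output tokens into hidden reasoning tokens and visible ones. It is a minimal illustration, not OpenAI's client code: the helper `split_output_tokens` is hypothetical, the field names (`completion_tokens`, `completion_tokens_details`, `reasoning_tokens`) are assumed to match the usage object returned by the API, reasoning tokens are assumed to be included in `completion_tokens`, and the numbers are made up.

```python
def split_output_tokens(usage: dict) -> dict:
    """Split billed output tokens into hidden reasoning tokens and visible ones.

    `usage` is assumed to look like the usage object of an API response;
    missing fields are treated as zero.
    """
    billed = usage.get("completion_tokens", 0)
    details = usage.get("completion_tokens_details") or {}
    reasoning = details.get("reasoning_tokens", 0)
    return {
        "billed_output": billed,        # everything charged as output
        "reasoning": reasoning,         # spent internally, never shown
        "visible": billed - reasoning,  # what actually appears in the answer
    }


# Made-up numbers, for illustration only.
example_usage = {
    "prompt_tokens": 20,
    "completion_tokens": 500,
    "completion_tokens_details": {"reasoning_tokens": 384},
}

print(split_output_tokens(example_usage))
```

With these illustrative numbers, 384 of the 500 billed output tokens would be reasoning the caller pays for but cannot read.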

https://simonwillison.net/2024/Sep/12/openai-o1/