Training LLMs to Reason in a Continuous Latent Space

Language models typically reason in "language space," generating a chain of thought in word tokens to solve complex problems. However, this is not always optimal for reasoning: many of those tokens serve textual coherence rather than the reasoning itself. To address this, Coconut (Chain of Continuous Thought) introduces a paradigm that feeds the LLM's last hidden state back as the next input embedding, using it as a continuous representation of the reasoning state. Because this "continuous thought" is never collapsed into a single discrete token, it can encode multiple candidate next steps at once, enabling a breadth-first-search-like exploration of reasoning paths. Coconut outperforms chain-of-thought on logical reasoning tasks that require planning and backtracking, offering valuable insights into latent reasoning for future research.

https://arxiv.org/abs/2412.06769
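The core loop difference can be illustrated with a toy sketch. This is not the paper's implementation; the model here is a hypothetical stand-in (a random matrix plus a nonlinearity for the forward pass, a random embedding table), and it only shows the control flow: language-space reasoning decodes a token each step and re-embeds it, while latent reasoning feeds the hidden state straight back in.

```python
import numpy as np

rng = np.random.default_rng(0)
d, vocab = 8, 16
W_h = rng.normal(size=(d, d)) * 0.3   # toy stand-in for one transformer forward pass
W_out = rng.normal(size=(vocab, d))   # hidden state -> token logits
E = rng.normal(size=(vocab, d))       # token embedding table

def step(x):
    # One forward pass: input embedding -> hidden state (toy model)
    return np.tanh(W_h @ x)

x = rng.normal(size=d)  # embedding of the question

# Chain-of-thought in language space: each step collapses the hidden
# state to one discrete token, discarding everything else it encoded.
h = x
for _ in range(3):
    h = step(h)
    token = int(np.argmax(W_out @ h))  # decode a single token
    h = E[token]                       # re-embed it as the next input

# Coconut-style latent reasoning: the hidden state itself becomes the
# next input, so no information is lost to discretization.
z = x
for _ in range(3):
    z = step(z)  # "continuous thought" fed directly back in

print(h.shape, z.shape)  # both remain d-dimensional vectors
```

The key point is the absence of the decode/re-embed step in the second loop: the continuous thought stays a full d-dimensional vector between steps, which is what lets it carry a superposition of possible reasoning branches.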