A year ago, rumors swirled that large language models could play chess surprisingly well. Despite not being designed for the game, LLMs startled many observers by playing plausible moves in positions they had never seen before. Could this accidental chess prowess hint at broader reasoning abilities? Recent experiments tell a more complicated story: most LLMs struggle badly at chess, and only one model, gpt-3.5-turbo-instruct, plays at an excellent level. That outlier has prompted theories about why certain models perform so much better, with potential implications for how LLMs process different kinds of data. Even with the complications introduced by prompts and tokenizers, the investigation into LLMs playing chess continues to unfold. More details at dynomight.net/chess.
https://dynomight.substack.com/p/chess