LLMs use a surprisingly simple mechanism to retrieve some stored knowledge

Large language models, such as those that power ChatGPT, are complex tools used in customer support, code generation, and language translation. Despite their widespread use, scientists still struggle to understand how they work internally. Researchers at MIT found that these models often use a surprisingly simple linear function to retrieve stored facts. By identifying the linear function for each kind of relation, researchers can probe a model to find out what knowledge it has stored. Not all information is stored linearly, but understanding these mechanisms could help correct falsehoods inside a model and keep AI chatbots from giving incorrect answers. The research sheds light on how large language models recall factual knowledge: some facts are retrieved through simple linear mechanisms, while others depend on more complex ones.
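
To make the idea of a "linear function that retrieves a fact" concrete, here is a toy sketch in plain Python. It is not the researchers' code and does not touch a real model. The subject vectors, the object vectors, the matrix W_PLAYS, and the bias B_PLAYS are all invented for illustration. The sketch only shows the general shape of the mechanism: a subject's internal representation is passed through one relation-specific affine map (W times s, plus b), and the result is matched to the closest known object.

```python
# Toy illustration only: every number and name below is made up.
# Real models use vectors with thousands of dimensions, and the map
# would be estimated from the model itself rather than written by hand.

def affine(W, b, s):
    """Return W @ s + b for a list-of-lists matrix W and list vectors b, s."""
    return [
        sum(w_ij * s_j for w_ij, s_j in zip(row, s)) + b_i
        for row, b_i in zip(W, b)
    ]

def nearest(vec, candidates):
    """Return the name of the candidate vector closest to vec."""
    def dist2(u, v):
        # Squared Euclidean distance between two vectors.
        return sum((a - c) ** 2 for a, c in zip(u, v))
    return min(candidates, key=lambda name: dist2(vec, candidates[name]))

# Hypothetical hidden representations of two subjects.
SUBJECTS = {
    "Miles Davis": [1.0, 0.0],
    "Jimi Hendrix": [0.0, 1.0],
}

# Hypothetical representations of candidate objects.
OBJECTS = {
    "trumpet": [2.0, 0.0],
    "guitar": [0.0, 2.0],
}

# Hypothetical affine map for the relation "plays the instrument".
W_PLAYS = [[1.0, -1.0],
           [-1.0, 1.0]]
B_PLAYS = [1.0, 1.0]

def decode(subject):
    """Apply the relation's affine map to a subject and name the nearest object."""
    predicted = affine(W_PLAYS, B_PLAYS, SUBJECTS[subject])
    return nearest(predicted, OBJECTS)

if __name__ == "__main__":
    for name in SUBJECTS:
        print(name, "->", decode(name))
```

The same W_PLAYS and B_PLAYS are reused for every subject, which is the point of the sketch: one simple function per relation, rather than a separate lookup per fact. A different relation would need its own matrix and bias.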

https://news.mit.edu/2024/large-language-models-use-surprisingly-simple-mechanism-retrieve-stored-knowledge-0325
