Automatically Detecting Under-Trained Tokens in Large Language Models

In this study, submitted on 8 May 2024, the authors address the problem of “glitch tokens”: tokens that can trigger unwanted model behavior because of the disconnect between how a tokenizer is created and how the model is subsequently trained. They develop methods for identifying such tokens automatically, combining analysis of the tokenizer itself, indicators derived from the model's weights, and targeted prompting. Applying these methods across a range of open models, they find that untrained and under-trained tokens are widespread, and they offer practical guidance for improving the efficiency and safety of language models.
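As a rough illustration of the weight-based-indicator idea (not the paper's exact procedure), the sketch below loads a causal LM and flags tokens whose centered output-embedding rows have unusually small norms, on the intuition that rarely or never trained tokens tend to receive little gradient signal. The model name `gpt2` and the two-standard-deviation cut-off are illustrative assumptions.

```python
# Minimal sketch: flag candidate under-trained tokens via a simple
# embedding-norm indicator. Model name and threshold are illustrative,
# not the method from the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any causal LM with an accessible output embedding
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Rows of the output (un)embedding matrix, one per vocabulary token.
unembedding = model.get_output_embeddings().weight.detach()  # (vocab_size, hidden_dim)

# Center the matrix so a shared mean direction does not dominate the norms.
centered = unembedding - unembedding.mean(dim=0, keepdim=True)

# Tokens with unusually small centered-embedding norms were likely seen
# rarely (or never) during training.
norms = centered.norm(dim=1)
threshold = norms.mean() - 2 * norms.std()  # illustrative cut-off
candidate_ids = torch.nonzero(norms < threshold).flatten().tolist()

for token_id in candidate_ids[:20]:
    print(token_id, repr(tokenizer.convert_ids_to_tokens(token_id)), float(norms[token_id]))
```

Candidates flagged this way would still need verification, for example by prompting the model with the suspect tokens as the paper describes.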

https://arxiv.org/abs/2405.05417
