Time is encoded in the weights of finetuned language models

The authors introduce time vectors, a simple tool for customizing language models to specific time periods. A time vector is created by finetuning a language model on data from a single time period and then subtracting the weights of the original pretrained model; it specifies a direction in weight space that improves performance on text from that period. Interestingly, time vectors specialized to adjacent time periods lie close together on a manifold in weight space. The authors show that interpolating between time vectors yields new models that perform better on intervening and future time periods, without any additional training. Consistent results across tasks, domains, model sizes, and time scales suggest that time is encoded in the weight space of finetuned models.
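
The weight arithmetic is straightforward. Below is a minimal sketch of how a time vector and an interpolated model could be built with HuggingFace checkpoints; the model names and paths are placeholders, not the paper's released checkpoints, and the mixing coefficient would in practice be tuned on validation data as the paper does.

```python
# Sketch of time-vector arithmetic (placeholder checkpoints, not the authors' code).
import torch
from transformers import AutoModelForCausalLM

def load_weights(name_or_path):
    return AutoModelForCausalLM.from_pretrained(name_or_path).state_dict()

base = load_weights("gpt2")                  # pretrained base model (placeholder)
ft_2016 = load_weights("path/to/ft-2016")    # finetuned on 2016 text (placeholder)
ft_2020 = load_weights("path/to/ft-2020")    # finetuned on 2020 text (placeholder)

# Time vector: finetuned weights minus pretrained weights, per parameter tensor.
tv_2016 = {k: ft_2016[k] - base[k] for k in base}
tv_2020 = {k: ft_2020[k] - base[k] for k in base}

# Interpolate between the two time vectors to target an intervening period,
# then add the interpolated vector back onto the pretrained weights.
alpha = 0.5  # mixing coefficient; tune on held-out data from the target period
new_weights = {k: base[k] + (1 - alpha) * tv_2016[k] + alpha * tv_2020[k]
               for k in base}

model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder architecture
model.load_state_dict(new_weights)
```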

https://arxiv.org/abs/2312.13401
