iTransformer: Inverted Transformers Are Effective for Time Series Forecasting

The authors discuss the recent popularity of linear forecasting models and question whether further architectural modifications to Transformer-based forecasters are necessary. Conventional Transformer forecasters embed the variates of each timestamp into a single temporal token and use attention to model dependencies across time. However, they tend to degrade in performance and explode in computation as the lookback window grows, and the unified embedding of each time token can struggle to learn meaningful representations and attention maps when the variates represent distinct physical measurements and may be misaligned in time. To address these issues, the authors propose iTransformer, which inverts the roles of the standard components without modifying them: attention is applied across variate tokens, each embedding a variate's entire series, to capture multivariate correlations, while the feed-forward network learns the temporal representation of each variate. iTransformer achieves strong results across several real-world benchmarks, with improvements in performance, generalization to unseen variates, and utilization of longer lookback windows.
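
To make the inversion concrete, here is a minimal PyTorch sketch of the idea as described above, not the authors' implementation: each variate's whole lookback series is embedded as one token, a standard Transformer encoder attends across variate tokens, and a linear head projects each token to the forecast horizon. The class name `ITransformerSketch` and all hyperparameter values are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ITransformerSketch(nn.Module):
    """Illustrative sketch of the inverted-Transformer idea (not the paper's code).

    Instead of one token per timestamp, each variate's whole lookback series
    becomes one token, so attention models cross-variate correlations and the
    feed-forward network models each variate's temporal representation.
    """

    def __init__(self, lookback: int, horizon: int, d_model: int = 128,
                 n_heads: int = 8, n_layers: int = 2):
        super().__init__()
        # Embed each variate's length-`lookback` series into a single token.
        self.embed = nn.Linear(lookback, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        # Project each variate token to a length-`horizon` forecast.
        self.project = nn.Linear(d_model, horizon)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, lookback, n_variates) -> variate-major layout
        tokens = self.embed(x.transpose(1, 2))       # (batch, n_variates, d_model)
        tokens = self.encoder(tokens)                # attention across variates
        return self.project(tokens).transpose(1, 2)  # (batch, horizon, n_variates)

# Example: forecast 24 steps from a 96-step lookback over 7 variates.
model = ITransformerSketch(lookback=96, horizon=24)
y = model(torch.randn(32, 96, 7))
print(y.shape)  # torch.Size([32, 24, 7])
```

Because tokens index variates rather than timestamps, the same trained model can in principle be applied to a different number of variates at inference time, which is one source of the generalization ability noted above.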

https://arxiv.org/abs/2310.06625
