The FFT Strikes Back: An Efficient Alternative to Self-Attention

In this study, we introduce FFTNet, a spectral filtering framework that uses the Fast Fourier Transform (FFT) to achieve global token mixing more efficiently than conventional self-attention. By operating in the frequency domain, FFTNet captures long-range dependencies while dynamically emphasizing salient frequency components through a learnable spectral filter and a modReLU activation. Experiments on the Long Range Arena and ImageNet benchmarks show FFTNet outperforming both fixed-Fourier and standard attention baselines. The approach offers an adaptive alternative to self-attention that addresses its poor scalability on long sequences.

https://arxiv.org/abs/2502.18394
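The core idea, FFT-based global mixing with a learnable spectral filter and a modReLU activation, can be sketched in a few lines. This is a minimal illustration of the general technique, not the paper's implementation: the function and variable names (`fft_mix`, `filt`) are my own, and the modReLU form used here is the standard one (scale the modulus, keep the phase).

```python
import numpy as np

def modrelu(z, b):
    # modReLU: shift the modulus by a bias b, rectify it, keep the phase.
    mag = np.abs(z)
    return z * np.maximum(mag + b, 0.0) / (mag + 1e-8)

def fft_mix(x, filt, b):
    # x: (seq_len, dim) real-valued tokens.
    # filt: complex spectral filter, shape (seq_len // 2 + 1, dim) —
    #       learnable in a real model, random here for illustration.
    X = np.fft.rfft(x, axis=0)                     # to frequency domain: every
                                                   # token mixes with every other
    X = modrelu(X * filt, b)                       # emphasize salient frequencies
    return np.fft.irfft(X, n=x.shape[0], axis=0)   # back to token space

rng = np.random.default_rng(0)
seq_len, dim = 16, 4
x = rng.standard_normal((seq_len, dim))
n_freq = seq_len // 2 + 1
filt = rng.standard_normal((n_freq, dim)) + 1j * rng.standard_normal((n_freq, dim))
y = fft_mix(x, filt, b=0.1)
print(y.shape)  # (16, 4)
```

Because the mixing happens via an FFT, the cost is O(n log n) in sequence length, versus O(n²) for pairwise self-attention, which is where the claimed efficiency on long sequences comes from.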
