σ-GPTs: A New Approach to Autoregressive Models

Autoregressive models such as the GPT family typically generate sequences in a fixed left-to-right order, but this paper challenges that norm. By adding a second positional encoding that specifies the position of the token to be predicted, the generation order can be chosen per sample, enabling capabilities such as infilling and conditioning on arbitrary subsets of tokens. The method also supports dynamically sampling multiple tokens at once with a rejection strategy, reducing the number of model evaluations needed. Evaluated across several domains, the approach substantially cut the number of steps required for generation. This technique could change how sequence generation tasks are approached.
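
The core idea can be sketched as follows: each input combines a token embedding with two positional encodings, one for where the token sits and one for the position to be predicted next, so any permutation of the sequence becomes a valid generation order. This is a minimal illustrative sketch, not the paper's implementation; all names and dimensions here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (not from the paper).
vocab, seq_len, d = 16, 8, 32
tok_emb = rng.normal(size=(vocab, d))    # token embedding table
pos_emb = rng.normal(size=(seq_len, d))  # positional embedding table

def sigma_gpt_inputs(tokens, order):
    """Build inputs for one shuffled generation order.

    Each row sums the token embedding, the encoding of the position the
    token sits at, and the encoding of the position the model must predict
    next -- the double positional encoding idea (hypothetical helper).
    """
    x = np.zeros((len(order) - 1, d))
    for i in range(len(order) - 1):
        cur, nxt = order[i], order[i + 1]
        x[i] = tok_emb[tokens[cur]] + pos_emb[cur] + pos_emb[nxt]
    return x

tokens = rng.integers(0, vocab, size=seq_len)
order = rng.permutation(seq_len)  # any order, not just left-to-right
X = sigma_gpt_inputs(tokens, order)
print(X.shape)  # one input row per prediction step
```

Because the target position is part of the input, the same trained model can be queried for any unfilled position, which is what makes per-sample orders and conditioning on arbitrary tokens possible.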

https://arxiv.org/abs/2404.09562