Bad numbers in the “gzip beats BERT” paper?

In the recent paper “Low-Resource” Text Classification: A Parameter-Free Classification Method with Compressors by Jiang et al., there appears to be a bug, or at least an unexpected choice, in the kNN code that makes the reported accuracy numbers higher than expected. The paper’s Table 5, which was widely shared on Twitter, shows the gzip method outperforming neural-network-based methods. With the issue corrected, the results change, and the gzip method performs poorly in some cases.

The method uses kNN classification with k=2, which is an unusual choice. With k=2 the two nearest neighbors either agree or tie 1-1, so tie-breaking matters a great deal. The code handles ties in a surprising way: a prediction is marked correct if any of the tied labels is correct. In effect, the paper’s numbers are top-2 accuracy, where a prediction counts as correct if either of the top two choices is correct. The code applies this tie-breaking strategy only for k=2. It also has a flag to break ties randomly, but that flag does not appear to have been used for the paper’s results.

The author of the linked post wrote a simple implementation with different tie-breaking strategies and found slight differences in the results compared to the paper.
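To make the tie-breaking issue concrete, here is a minimal Python sketch of scoring a single kNN prediction under three tie-breaking strategies. It is not the paper’s code or the linked post’s implementation. The function names (`top_k_labels`, `is_correct`) and the strategy names (`optimistic`, `random`, `nearest`) are made up for illustration, and the distances are assumed to have been computed already (for example, with a compression-based distance).

```python
import random
from collections import Counter


def top_k_labels(distances, labels, k=2):
    """Return the labels of the k nearest training items, nearest first."""
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    return [labels[i] for i in order[:k]]


def is_correct(neighbor_labels, true_label, tie_break="optimistic", rng=None):
    """Score one prediction given the k nearest labels (nearest first).

    tie_break (hypothetical names, for illustration only):
      "optimistic": count the prediction as correct if ANY tied label
                    matches; with k=2 this amounts to top-2 accuracy.
      "random":     pick one of the tied labels uniformly at random.
      "nearest":    pick the tied label whose neighbor is closest; with
                    k=2 this gives the same answer as plain k=1.
    """
    counts = Counter(neighbor_labels)
    best = max(counts.values())
    # Unique labels sharing the highest vote count, kept in nearest-first order.
    tied = list(dict.fromkeys(
        lab for lab in neighbor_labels if counts[lab] == best
    ))

    if tie_break == "optimistic":
        return true_label in tied
    if tie_break == "random":
        rng = rng or random.Random()
        return rng.choice(tied) == true_label
    if tie_break == "nearest":
        return tied[0] == true_label
    raise ValueError(f"unknown tie_break: {tie_break}")


# Example: the two nearest neighbors disagree, and the true label is
# the second-nearest one.
neighbors = top_k_labels([0.9, 0.1, 0.5], ["a", "b", "c"], k=2)
print(neighbors)
print(is_correct(neighbors, "c", "optimistic"))
print(is_correct(neighbors, "c", "nearest"))
```

In this example the two nearest labels disagree, so the optimistic rule scores the prediction as correct while the nearest-neighbor rule scores it as wrong. Averaged over a test set, that gap is the difference between top-2 accuracy and ordinary accuracy.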

https://kenschutte.com/gzip-knn-paper/