The author introduces RaBitQ, an approximate nearest neighbor search algorithm, along with its initial C++ implementation and a subsequent Rust rewrite that was slow at first. They walk through the improvements step by step: preparing the dataset, profiling with tools such as Samply and Cargo, and benchmarking with Criterion. They then turn to CPU-specific optimizations, SIMD implementations, and Rust linear algebra crates such as faer. The key optimizations are an AVX2 binary dot product and manual scalar quantization, which yield large gains in queries per second (QPS) on the GIST dataset. They also warn that excessive metrics, logging, and tracing hurt performance. The post closes with lessons on using SIMD properly, the importance of I/O, and choosing libraries wisely.
https://blog.mapotofu.org/blogs/rabitq-bench/
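
To make the "binary dot product" mentioned in the summary concrete, here is a minimal sketch in Rust. It is not the author's code and it is not the AVX2 version: it is a portable scalar stand-in that assumes the bit vectors are packed into `u64` words, and the names `binary_dot` and `pack_bits` are hypothetical. The dot product of two 0/1 vectors is the number of positions where both bits are set, so it reduces to a bitwise AND followed by a population count.

```rust
/// Dot product of two 0/1 vectors packed into u64 words.
/// Hypothetical helper for illustration; not taken from the linked post.
pub fn binary_dot(a: &[u64], b: &[u64]) -> u32 {
    assert_eq!(a.len(), b.len(), "bit vectors must have the same number of words");
    a.iter()
        .zip(b)
        // AND keeps the positions where both bits are 1; count_ones sums them.
        .map(|(x, y)| (x & y).count_ones())
        .sum()
}

/// Packs booleans into u64 words, least significant bit first.
/// Hypothetical helper for illustration; the packing order is an assumption.
pub fn pack_bits(bits: &[bool]) -> Vec<u64> {
    let mut words = vec![0u64; (bits.len() + 63) / 64];
    for (i, &bit) in bits.iter().enumerate() {
        if bit {
            words[i / 64] |= 1u64 << (i % 64);
        }
    }
    words
}
```

A SIMD version would process several words per instruction, but the AND-then-popcount structure stays the same, which makes a scalar function like this a convenient reference when checking an optimized kernel for correctness.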