The author discusses their interest in an optimization for handling small strings with SIMD instructions, in the context of a fast hash function written in Rust. They explore the AVX-512-BW and ARM SVE instruction set extensions, focusing on a tolower64() function that converts 64 bytes to lower case at a time. They then implement masked load and store operations so that string fragments shorter than a full vector can be processed quickly. Benchmark results compare the tolower64()-based code against standard functions such as memcpy and tolower(). The conclusion highlights the smooth performance of AVX-512-BW and the improvements in string handling that these extensions make possible. The author would like to see the extensions become more widely available.
https://dotat.at/@/2024-07-28-tolower-avx512.html
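
The masked load and store idea summarized above can be sketched in C roughly as follows. This is a minimal illustration and not the author's code: the name copy_tolower() and the scalar fallback are assumptions made for this example, and only the tolower64() name comes from the summary. The AVX-512 path is compiled only when the compiler is told to target AVX-512-BW (for example with -mavx512bw); otherwise the plain loop is used.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__AVX512BW__)
#include <immintrin.h>

/* Convert 64 bytes at once: add 'a' - 'A' to bytes in the range 'A'..'Z'. */
static inline __m512i tolower64(__m512i c) {
    __m512i A = _mm512_set1_epi8('A');
    __m512i Z = _mm512_set1_epi8('Z');
    __m512i delta = _mm512_set1_epi8('a' - 'A');
    /* Signed comparisons: bytes >= 0x80 are negative, so never upper case. */
    __mmask64 ge_A = _mm512_cmpge_epi8_mask(c, A);
    __mmask64 le_Z = _mm512_cmple_epi8_mask(c, Z);
    __mmask64 is_upper = _kand_mask64(ge_A, le_Z);
    /* Lanes outside the mask keep their original byte from c. */
    return _mm512_mask_add_epi8(c, is_upper, c, delta);
}

/* Hypothetical helper: copy len bytes from src to dst, lower-casing ASCII. */
void copy_tolower(char *dst, const char *src, size_t len) {
    while (len >= 64) {
        __m512i c = _mm512_loadu_si512(src);
        _mm512_storeu_si512(dst, tolower64(c));
        src += 64;
        dst += 64;
        len -= 64;
    }
    if (len > 0) {
        /* One mask bit per byte: the low len bits are set (len is 1..63). */
        __mmask64 mask = ((uint64_t)1 << len) - 1;
        /* Masked-off lanes are not read on load and not written on store. */
        __m512i c = _mm512_maskz_loadu_epi8(mask, src);
        _mm512_mask_storeu_epi8(dst, mask, tolower64(c));
    }
}

#else

/* Scalar fallback (an assumption of this sketch) with the same behaviour. */
void copy_tolower(char *dst, const char *src, size_t len) {
    for (size_t i = 0; i < len; i++) {
        char c = src[i];
        dst[i] = (c >= 'A' && c <= 'Z') ? (char)(c + ('a' - 'A')) : c;
    }
}

#endif
```

With this shape, a short fragment needs no byte-by-byte tail loop: a call such as copy_tolower(out, "Hello", 5) takes a single masked load, one tolower64(), and a single masked store on the AVX-512 path, and leaves the bytes after out[4] untouched.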