Taming floating-point sums

Summing an array of floating-point numbers can accumulate large rounding errors, and the author presents several methods for addressing this. They show the limitations and risks of simple approaches such as the “naive_sum” function and propose more accurate alternatives: pairwise summation, Kahan summation, and exact summation. They recommend compiler intrinsics, specifically the “fadd_algebraic” operators, for efficient and accurate floating-point summation. Benchmark results show significant performance differences between the summation methods, and the “block_kahan_autovec” method is identified as a Pareto-optimal choice that balances accuracy and speed. The author also warns against certain LLVM flags that can lead to undefined behavior in floating-point calculations.
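To make the difference between these methods concrete, here is a minimal Python sketch of three of them. It is illustrative only and is not the author's code: the name “naive_sum” comes from the summary above, but its body, along with the names kahan_sum and pairwise_sum and the block size of 8, are assumptions made for this example. The intrinsics-based and “block_kahan_autovec” variants are not reproduced here, since they depend on compiler features that Python does not expose.

```python
import math


def naive_sum(xs):
    """Left-to-right accumulation; rounding error can grow with len(xs)."""
    total = 0.0
    for x in xs:
        total += x
    return total


def kahan_sum(xs):
    """Kahan (compensated) summation: carry the rounding error forward."""
    total = 0.0
    comp = 0.0  # estimate of the low-order bits lost so far
    for x in xs:
        y = x - comp            # apply the correction to the next term
        t = total + y           # low-order bits of y may be lost here
        comp = (t - total) - y  # recover what was lost (algebraically zero)
        total = t
    return total


def pairwise_sum(xs, lo=0, hi=None, block=8):
    """Pairwise summation: split in half, sum each half, add the results."""
    if hi is None:
        hi = len(xs)
    if hi - lo <= block:  # small ranges fall back to the naive loop
        return naive_sum(xs[lo:hi])
    mid = (lo + hi) // 2
    return pairwise_sum(xs, lo, mid, block) + pairwise_sum(xs, mid, hi, block)


if __name__ == "__main__":
    data = [0.1] * 100_000
    reference = math.fsum(data)  # correctly rounded sum from the stdlib
    for fn in (naive_sum, kahan_sum, pairwise_sum):
        print(f"{fn.__name__:>12}: error = {abs(fn(data) - reference):.3e}")
```

Running the script prints each method's absolute error against math.fsum, which returns a correctly rounded sum and stands in here for exact summation. Note that the compensation step in kahan_sum is only meaningful when the language evaluates floating-point expressions exactly as written, which Python does.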

