Title
Algorithms for Efficient Reproducible Floating Point Summation
Abstract
We define “reproducibility” as getting bitwise identical results from multiple runs of the same program, perhaps with different hardware resources or other changes that should not affect the answer. Many users depend on reproducibility for debugging or correctness. However, dynamic scheduling of parallel computing resources, combined with nonassociative floating point addition, makes reproducibility challenging even for summation, or operations like the BLAS. We describe a “reproducible accumulator” data structure (the “binned number”) and associated algorithms to reproducibly sum binary floating point numbers, independent of summation order. We use a subset of the IEEE Floating Point Standard 754-2008 and bitwise operations on the standard representations in memory. Our approach requires only one read-only pass over the data, and one reduction in parallel, using a 6-word reproducible accumulator (more words can be used for higher accuracy), enabling standard tiling optimization techniques. Summing n words with a 6-word reproducible accumulator requires approximately 9n floating point operations (arithmetic, comparison, and absolute value) and approximately 3n bitwise operations. The final error bound with a 6-word reproducible accumulator and our default settings can be up to 2^29 times smaller than the error bound for conventional (recursive) summation on ill-conditioned double-precision inputs.
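The order dependence described in the abstract, and one way to avoid it, can be illustrated with a short sketch. This is not the one-pass binned accumulator of the paper: it is a simplified pre-rounding scheme that needs an extra pass per fold to find the largest remaining magnitude. The name `reproducible_sum` and the parameter `folds` are hypothetical, and the code assumes finite IEEE 754 binary64 inputs in round-to-nearest mode, far from overflow.

```python
import math
import random


def reproducible_sum(xs, folds=3):
    """Order-independent sum of finite floats via pre-rounding (sketch).

    Each fold rounds every remaining value to a common grid, so the
    rounded pieces add without rounding error in any order.
    """
    rs = list(xs)                      # residuals not yet accumulated
    k = len(rs).bit_length()           # guarantees len(rs) < 2**k
    slice_sums = []
    for _ in range(folds):
        m = max((abs(r) for r in rs), default=0.0)
        if m == 0.0:
            break                      # nothing left to accumulate
        _, e = math.frexp(m)           # m < 2**e
        big = math.ldexp(1.5, e + k + 1)   # anchor; its ulp is the grid
        s = 0.0
        for i, r in enumerate(rs):
            q = (big + r) - big        # r rounded to a grid multiple
            s += q                     # exact: bounded grid multiples
            rs[i] = r - q              # exact remainder for next fold
        slice_sums.append(s)
    total = 0.0
    for s in reversed(slice_sums):     # fixed order, smallest slice first
        total += s
    return total


if __name__ == "__main__":
    # Floating point addition is not associative.
    print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))
    data = [random.uniform(-1.0, 1.0) for _ in range(1000)]
    a = reproducible_sum(data)
    random.shuffle(data)
    print(a == reproducible_sum(data))
```

Because the anchor depends only on the element count and the largest magnitude, both of which are independent of ordering, every permutation of the input produces the same slices and therefore the same bits. Residuals left after the last fold are dropped, so accuracy grows with `folds`, loosely mirroring the abstract's remark that more accumulator words give higher accuracy.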
Year
2020
DOI
10.1145/3389360
Venue
ACM Transactions on Mathematical Software
Keywords
Reproducible summation, binned number, binned summation, floating point number, floating point summation, reproducibility, parallel, computer arithmetic, summation
DocType
Journal
Volume
46
Issue
3
ISSN
0098-3500
Citations
0
PageRank
0.34
References
0
Authors
3
Name              Order  Citations  PageRank
Peter Ahrens      1      1          1.04
James Demmel      2      4817       551.47
Hong Diep Nguyen  3      138        8.93