Algorithms for Efficient Reproducible Floating Point Summation

Abstract

We define “reproducibility” as getting bitwise identical results from multiple runs of the same program, perhaps with different hardware resources or other changes that should not affect the answer. Many users depend on reproducibility for debugging or correctness. However, dynamic scheduling of parallel computing resources, combined with nonassociative floating point addition, makes reproducibility challenging even for summation, or operations like the BLAS. We describe a “reproducible accumulator” data structure (the “binned number”) and associated algorithms to reproducibly sum binary floating point numbers, independent of summation order. We use a subset of the IEEE Floating Point Standard 754-2008 and bitwise operations on the standard representations in memory. Our approach requires only one read-only pass over the data, and one reduction in parallel, using a 6-word reproducible accumulator (more words can be used for higher accuracy), enabling standard tiling optimization techniques. Summing n words with a 6-word reproducible accumulator requires approximately 9n floating point operations (arithmetic, comparison, and absolute value) and approximately 3n bitwise operations. The final error bound with a 6-word reproducible accumulator and our default settings can be up to 2^29 times smaller than the error bound for conventional (recursive) summation on ill-conditioned double-precision inputs.
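
To make the problem in the abstract concrete, the following Python sketch first shows that conventional left-to-right summation depends on the order of the inputs, and then shows one simple way to remove that dependence. It is not the binned-number algorithm of the paper. It is a simplified pre-rounding scheme: it needs an extra pass to find the largest magnitude (the paper's method needs only one pass), and it ignores overflow, NaN, and infinities. The names naive_sum and prerounded_sum and the parameter folds are hypothetical and chosen only for this illustration.

```python
import math
import random


def naive_sum(xs):
    """Conventional (recursive) summation, left to right."""
    total = 0.0
    for x in xs:
        total += x
    return total


def prerounded_sum(xs, folds=3):
    """Order-independent sum by pre-rounding (illustrative sketch only).

    Assumptions: finite IEEE double inputs, round-to-nearest-even,
    no overflow. Each fold rounds every input to a shared grid, so the
    rounded pieces add without rounding error in any order.
    """
    xs = list(xs)
    n = len(xs)
    # Extra pass (not needed by the paper's method): largest magnitude.
    bound = max((abs(x) for x in xs), default=0.0)
    bins = []
    for _ in range(folds):
        if bound == 0.0:
            break
        _, e = math.frexp(n * bound)      # guarantees n * bound < 2**e
        m = math.ldexp(1.5, e + 2)        # extractor; ulp(m) == 2**(e - 50)
        acc = 0.0
        rest = []
        for x in xs:
            q = (m + x) - m               # x rounded to a multiple of ulp(m)
            acc += q                      # exact: every q lies on one grid
            rest.append(x - q)            # exact remainder, at most ulp(m)/2
        bins.append(acc)
        xs = rest                         # next fold sums the remainders
        bound = math.ldexp(1.0, e - 51)   # ulp(m) / 2
    total = 0.0
    for b in bins:                        # fixed order: largest bin first
        total += b
    return total


a = [1e16, 1.0, -1e16, 1.0]
b = [1e16, -1e16, 1.0, 1.0]               # same numbers, different order
print(naive_sum(a), naive_sum(b))         # prints 1.0 2.0
print(prerounded_sum(a), prerounded_sum(b))  # prints 2.0 2.0

data = [random.uniform(-1.0, 1.0) * 10.0 ** random.randint(-10, 10)
        for _ in range(1000)]
reference = prerounded_sum(data)
random.shuffle(data)
print(prerounded_sum(data) == reference)  # prints True
```

Each fold plays a role loosely similar to one bin of the accumulator: more folds capture more low-order bits, at the cost of more work per input. Because the per-fold sums are exact and are combined in a fixed order, the result does not depend on how the inputs are ordered.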

Document Details

Document Type
Pub Defense Publication
Publication Date
Jul 21, 2020
Source ID
10.1145/3389360

Entities

People

  • Hong Diep Nguyen
  • James Demmel
  • Willow Ahrens

Organizations

  • Cray
  • Defense Advanced Research Projects Agency
  • Google
  • HP
  • Huawei
  • Intel Corporation
  • LG Electronics
  • Massachusetts Institute of Technology
  • MathWorks
  • National Science Foundation
  • Nokia
  • Nvidia
  • Oracle
  • Samsung Group
  • Saudi Aramco
  • United States Department of Energy
  • University of California, Berkeley

Tags

Readers

  • Integrated Circuit Design and Technology
  • Linear Algebra
  • Oncology and Biomarker-Based Cancer Detection