Title
A framework for low-communication 1-D FFT
Abstract
In high-performance computing on distributed-memory systems, communication often represents a significant part of the overall execution time. The relative cost of communication will certainly continue to rise as compute-density growth follows the current technology and industry trends. Design of lower-communication alternatives to fundamental computational algorithms has become an important field of research. For distributed 1-D FFT, communication cost has hitherto remained high as all industry-standard implementations perform three all-to-all internode data exchanges (also called global transposes). These communication steps indeed dominate execution time. In this paper, we present a mathematical framework from which many single-all-to-all and easy-to-implement 1-D FFT algorithms can be derived. For large-scale problems, our implementation can be twice as fast as leading FFT libraries on state-of-the-art computer clusters. Moreover, our framework allows tradeoff between accuracy and performance, further boosting performance if reduced accuracy is acceptable.
Year: 2013
DOI: 10.1109/SC.2012.5
Venue: Scientific Programming - Selected Papers from Super Computing 2012
Keywords: distributed memory systems, electronic data interchange, fast fourier transforms, mathematical analysis, computational algorithm, compute density growth, distributed memory system, high performance computing, industry standard, internode data exchanges, low communication 1d fft, mathematical framework
DocType: Journal
Volume: 21
Issue: 3-4
ISSN: 2167-4329
ISBN: 978-1-4673-0805-2
Citations: 10
PageRank: 0.75
References: 15
Authors: 4
Name             Order  Citations  PageRank
P. T. P. Tang    1      111        11.63
Jongsoo Park     2      103        9.49
Daehyun Kim      3      10         0.75
Vladimir Petrov  4      10         0.75