Title
FFT Convolutions are Faster than Winograd on Modern CPUs, Here is Why.
Abstract
Winograd-based convolution has quickly gained traction as a preferred approach for implementing convolutional neural networks (ConvNets) on various hardware platforms because it requires fewer floating point operations than FFT-based or direct convolutions. This paper compares three highly optimized implementations (regular FFT-, Gauss-FFT-, and Winograd-based convolutions) on modern multi- and many-core CPUs. Although all three implementations employ the same optimizations for modern CPUs, our experimental results with two popular ConvNets (VGG and AlexNet) show that the FFT-based implementations generally outperform the Winograd-based approach, contrary to popular belief. To understand the results, we use a Roofline performance model to analyze the three implementations in detail, examining each of their computation phases and considering not only the number of floating point operations but also the memory bandwidth and cache sizes. The performance analysis explains why, and under what conditions, the FFT-based implementations outperform the Winograd-based one on modern CPUs.
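The sketch below is not from the paper; it is a minimal, self-contained illustration of the two ideas the abstract refers to: computing a convolution through the FFT (convolution theorem) and the Roofline bound min(peak compute, bandwidth x arithmetic intensity) used to reason about performance. All names and the machine numbers are hypothetical.

import numpy as np
from numpy.fft import rfft2, irfft2

def fft_conv2d(image, kernel):
    """Full 2D convolution computed as an element-wise product in the Fourier domain."""
    h = image.shape[0] + kernel.shape[0] - 1
    w = image.shape[1] + kernel.shape[1] - 1
    return irfft2(rfft2(image, (h, w)) * rfft2(kernel, (h, w)), (h, w))

def roofline_gflops(arithmetic_intensity, peak_gflops, bandwidth_gbs):
    """Attainable GFLOP/s = min(peak compute, memory bandwidth * flops per byte)."""
    return min(peak_gflops, bandwidth_gbs * arithmetic_intensity)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, k = rng.standard_normal((32, 32)), rng.standard_normal((3, 3))
    # Naive "full" direct convolution: accumulate shifted, scaled copies of x.
    direct = np.zeros((34, 34))
    for i in range(3):
        for j in range(3):
            direct[i:i + 32, j:j + 32] += k[i, j] * x
    assert np.allclose(fft_conv2d(x, k), direct, atol=1e-9)
    # Hypothetical machine: 3 TFLOP/s peak, 100 GB/s memory bandwidth.
    print(roofline_gflops(arithmetic_intensity=10.0, peak_gflops=3000.0,
                          bandwidth_gbs=100.0))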
Year
2018
Venue
arXiv: Performance
Field
Memory bandwidth, Cache, Convolutional neural network, Floating point, Convolution, Computer science, Parallel computing, Implementation, Fast Fourier transform, Computation
DocType
Journal
Volume
abs/1809.07851
Citations
0
PageRank
0.34
References
10
Authors
4
Name                  Order  Citations  PageRank
Aleksandar Zlateski   1      39         5.65
Zhen Jia              2      338        17.82
Kai Li                3      5345       435.41
Frédo Durand          4      8625       414.94