Title
Faster Population Counts Using AVX2 Instructions
Abstract
Counting the number of ones in a binary stream is a common operation in database, information-retrieval, cryptographic and machine-learning applications. Most processors have dedicated instructions to count the number of ones in a word (e.g. popcnt on x64 processors). Maybe surprisingly, we show that a vectorized approach using SIMD instructions can be twice as fast as using the dedicated instructions on recent Intel processors. The benefits can be even greater for applications such as similarity measures (e.g. the Jaccard index) that require additional Boolean operations. Our approach has been adopted by LLVM: it is used by its popular C compiler (Clang).
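To make the two quantities named in the abstract concrete, here is a minimal scalar sketch of a population count and of the Jaccard index over bitsets. It is not the vectorized AVX2 method the paper describes, only a reference for what is being computed. The function names (`popcount`, `popcount_words`, `jaccard_index`), the representation of a bitset as a list of non-negative integers, and the convention that two empty sets have index 1.0 are all assumptions made for illustration.

```python
from typing import Sequence


def popcount(word: int) -> int:
    """Return the number of 1 bits in a non-negative integer."""
    if word < 0:
        raise ValueError("popcount is only defined here for non-negative integers")
    return bin(word).count("1")


def popcount_words(words: Sequence[int]) -> int:
    """Return the total number of 1 bits across a sequence of words."""
    return sum(popcount(w) for w in words)


def jaccard_index(a: Sequence[int], b: Sequence[int]) -> float:
    """Jaccard index |A & B| / |A | B| of two bitsets of equal length.

    Each bitset is a sequence of non-negative integers (for example
    64-bit words). Assumption: two empty sets are given index 1.0.
    """
    if len(a) != len(b):
        raise ValueError("bitsets must contain the same number of words")
    # Each word pair needs one Boolean operation plus one population count.
    intersection = sum(popcount(x & y) for x, y in zip(a, b))
    union = sum(popcount(x | y) for x, y in zip(a, b))
    return intersection / union if union else 1.0
```

The Jaccard computation shows why similarity measures combine Boolean operations with counting: every pair of words is ANDed and ORed before its bits are counted.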
Year
2018
DOI
10.1093/comjnl/bxx046
Venue
COMPUTER JOURNAL
Keywords
software performance, SIMD instructions, vectorization, bitset, Jaccard index
DocType
Journal
Volume
61
Issue
1
ISSN
0010-4620
Citations
1
PageRank
0.36
References
0
Authors
3
Name, Order, Citations, PageRank
Wojciech Mula, 1, 1, 0.36
Nathan Kurz, 2, 31, 2.45
Daniel Lemire, 3, 34, 7.76