Title
The Architectural Implications of Facebook's DNN-Based Personalized Recommendation
Abstract
The widespread application of deep learning has changed the landscape of computation in data centers. In particular, personalized recommendation for content ranking is now largely accomplished using deep neural networks. However, despite their importance and the amount of compute cycles they consume, relatively little research attention has been devoted to recommendation systems. To facilitate research and advance the understanding of these workloads, this paper presents a set of real-world, production-scale DNNs for personalized recommendation coupled with relevant performance metrics for evaluation. In addition to releasing a set of open-source workloads, we conduct in-depth analysis that underpins future system design and optimization for at-scale recommendation: Inference latency varies by 60% across three Intel server generations, batching and co-location of inference jobs can drastically improve latency-bounded throughput, and diversity across recommendation models leads to different optimization strategies.
Year
2020
DOI
10.1109/HPCA47549.2020.00047
Venue
2020 IEEE International Symposium on High Performance Computer Architecture (HPCA)
Keywords
data centers, personalized recommendation, content ranking, deep neural networks, recommendation systems, production-scale DNNs, open-source workloads, recommendation models, Facebook, deep learning, latency-bounded throughput
DocType
Conference
ISSN
1530-0897
ISBN
978-1-7281-6150-1
Citations
4
PageRank
0.44
References
24
Authors
16
Name                  Order  Citations  PageRank
Udit Gupta            1      74         6.27
Carole-Jean Wu        2      432        23.81
Xiaodong Wang         3      126        6.24
Maxim Naumov          4      68         10.29
Brandon Reagen        5      210        13.90
David Brooks          6      5518       422.08
Bradford Cottel       7      4          0.44
Kim M. Hazelwood      8      2465       110.46
Mark Hempstead        9      980        81.39
Bill Jia              10     126        5.90
Hsien-Hsin Sean Lee   11     1657       102.66
Andrey Malevich       12     4          0.44
Dheevatsa Mudigere    13     289        19.84
Mikhail Smelyanskiy   14     1160       65.96
Liang Xiong           15     4          0.44
Xuan Zhang            16     93         25.30