Title
Workload analysis of a large-scale key-value store
Abstract
Key-value stores are a vital component in many scale-out enterprises, including social networks, online retail, and risk analysis. Accordingly, they are receiving increased attention from the research community in an effort to improve their performance, scalability, reliability, cost, and power consumption. To be effective, such efforts require a detailed understanding of realistic key-value workloads. And yet little is known about these workloads outside of the companies that operate them. This paper aims to address this gap. To this end, we have collected detailed traces from Facebook's Memcached deployment, arguably the world's largest. The traces capture over 284 billion requests from five different Memcached use cases over several days. We analyze the workloads from multiple angles, including: request composition, size, and rate; cache efficacy; temporal patterns; and application use cases. We also propose a simple model of the most representative trace to enable the generation of more realistic synthetic workloads by the community. Our analysis details many characteristics of the caching workload. It also reveals a number of surprises: a GET/SET ratio of 30:1 that is higher than assumed in the literature; some applications of Memcached behave more like persistent storage than a cache; strong locality metrics, such as keys accessed many millions of times a day, do not always suffice for a high hit rate; and there is still room for efficiency and hit rate improvements in Memcached's implementation. Toward the last point, we make several suggestions that address the exposed deficiencies.
Year
2012
DOI
10.1145/2254756.2254766
Venue
SIGMETRICS
Keywords
workload analysis,analysis detail,large-scale key-value store,hit rate improvement,realistic key-value workloads,workloads outside,realistic synthetic workloads,memcached deployment,cache efficacy,application use case,high hit rate,different memcached use case,social network,key value store,distributed databases,use case,risk analysis
Field
Hit rate,Locality,Software deployment,Use case,Workload,Risk analysis (business),Cache,Computer science,Real-time computing,Distributed computing,Scalability
DocType
Conference
Volume
40
Issue
1
ISSN
0163-5999
Citations
168
PageRank
6.72
References
23
Authors
5
Name, Order, Citations, PageRank
Berk Atikoglu, 1, 192, 8.24
Yuehai Xu, 2, 217, 10.91
Eitan Frachtenberg, 3, 1060, 85.08
Song Jiang, 4, 1178, 51.38
Mike Paleczny, 5, 262, 11.38