Title
Walnut: a unified cloud object store
Abstract
Walnut is an object-store being developed at Yahoo! with the goal of serving as a common low-level storage layer for a variety of cloud data management systems including Hadoop (a MapReduce system), MObStor (a multimedia serving system), and PNUTS (an extended key-value serving system). Thus, a key performance challenge is to meet the latency and throughput requirements of the wide range of workloads commonly observed across these diverse systems. The motivation for Walnut is to leverage a carefully optimized low-level storage system, with support for elasticity and high-availability, across all of Yahoo!'s data clouds. This would enable sharing of hardware resources across hitherto siloed clouds of different types, offering greater potential for intelligent load balancing and efficient elastic operation, and simplify the operational tasks related to data storage. In this paper, we discuss the motivation for unifying different storage clouds, describe the requirements of a common storage layer, and present the Walnut design, which uses a quorum-based replication protocol and one-hop direct client access to the data in most regular operations. A unique contribution of Walnut is its hybrid object strategy, which efficiently supports both small and large objects. We present experiments based on both synthetic and real data traces, showing that Walnut works well over a wide range of workloads, and can indeed serve as a common low-level storage layer across a range of cloud systems.
Year
2012
DOI
10.1145/2213836.2213947
Venue
SIGMOD Conference
Keywords
common storage layer, different storage cloud, common low-level storage layer, real data trace, low-level storage system, unified cloud object store, data cloud, mapreduce system, data storage, wide range, cloud data management system, high availability, storage system, load balance
Field
Data mining, Converged storage, Computer science, Load balancing (computing), Computer data storage, Latency (engineering), Information repository, Throughput, Database, Cloud storage, Cloud computing, Distributed computing
DocType
Conference
Citations
10
PageRank
0.68
References
18
Authors
7
Name | Order | Citations | PageRank
Jianjun Chen | 1 | 10 | 0.68
Chris Douglas | 2 | 667 | 23.01
Michi Mutsuzaki | 3 | 20 | 1.59
Patrick Quaid | 4 | 10 | 0.68
Raghu Ramakrishnan | 5 | 12649 | 2243.05
Sriram Rao | 6 | 440 | 23.78
Russell Sears | 7 | 1799 | 85.12