Abstract |
---|
In order to scale economically, data centers are increasingly evolving their data storage methods from the use of simple data replication to the use of more powerful erasure codes, which provide the same level of reliability as replication but at a significantly lower storage cost. In particular, it is well known that Maximum-Distance-Separable (MDS) codes, such as Reed-Solomon codes, provide the maximum storage efficiency. While the use of codes for providing improved reliability in archival storage systems, where data is less frequently accessed (or so-called “cold data”), is well understood, the role of codes in the storage of more frequently accessed and active “hot data”, where latency is the key metric, is less clear. In this paper, we study data storage systems based on MDS codes through the lens of queueing theory, and term the queueing system arising under codes as an “MDS queue.” We present insightful scheduling policies that form upper and lower bounds to its performance, and use these to obtain easily computable analytical bounds on the average latency of the MDS queue. These bounds were observed to be quite tight in the settings we simulated. We additionally derive closed-form expressions of the throughputs of these systems. Finally, we employ the framework of the MDS queue to analyse different methods of performing so-called degraded reads (reading of partial data) in distributed data storage. |
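
The storage-cost comparison in the abstract can be illustrated with a minimal sketch. It assumes the standard property that an (n, k) MDS code stores n/k units per unit of data and survives any n - k node failures, while r-way replication stores r units and survives r - 1 failures. The function names and the (14, 10) parameters below are illustrative choices, not taken from the paper.

```python
from fractions import Fraction


def mds_overhead(n: int, k: int) -> Fraction:
    """Storage used per unit of data under an (n, k) MDS code (assumed model)."""
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    return Fraction(n, k)


def mds_tolerated_failures(n: int, k: int) -> int:
    """Any k of the n coded chunks suffice, so n - k losses are survivable."""
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    return n - k


def replication_copies_for(failures: int) -> int:
    """Copies needed so that data survives the given number of node failures."""
    if failures < 0:
        raise ValueError("failures must be non-negative")
    return failures + 1


def replication_overhead(copies: int) -> Fraction:
    """Storage used per unit of data when keeping `copies` full replicas."""
    if copies < 1:
        raise ValueError("need at least one copy")
    return Fraction(copies)


if __name__ == "__main__":
    n, k = 14, 10  # hypothetical Reed-Solomon parameters, for illustration only
    failures = mds_tolerated_failures(n, k)
    copies = replication_copies_for(failures)
    print("MDS overhead:", float(mds_overhead(n, k)))
    print("Failures tolerated:", failures)
    print("Replication overhead for the same tolerance:",
          float(replication_overhead(copies)))
```

Under these assumptions, matching the failure tolerance of the coded scheme with replication requires several times the storage, which is the efficiency gap the abstract refers to. The sketch says nothing about latency, which is the quantity the paper's MDS-queue analysis addresses.
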
Year | DOI | Venue |
---|---|---|
2014 | 10.1109/ISIT.2014.6874955 | Information Theory |
Keywords | Field | DocType |
Reed-Solomon codes,computer centres,data handling,queueing theory,telecommunication network reliability,MDS codes,MDS queue,archival storage systems,closed-form expressions,data centers,data storage methods,data storage systems,distributed data storage,erasure codes,latency performance analysis,maximum distance separable codes,queueing system,reliability improvement | Mathematical optimization,Replication (computing),Computer science,Computer data storage,Scheduling (computing),Queue,Distributed data store,Theoretical computer science,Storage efficiency,Queueing theory,Erasure code,Distributed computing | Conference |
Citations | PageRank | References |
32 | 1.19 | 19 |
Authors |
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Nihar B. Shah | 1 | 1202 | 77.17 |
Kangwook Lee | 2 | 32 | 1.19 |
Kannan Ramchandran | 3 | 9401 | 1029.57 |