Title
aHDFS: An Erasure-Coded Data Archival System for Hadoop Clusters.
Abstract
In this paper, we propose an erasure-coded data archival system called aHDFS for Hadoop clusters, where RS(k + r, k) codes are employed to archive data replicas in the Hadoop distributed file system (HDFS). We develop two archival strategies (i.e., aHDFS-Grouping and aHDFS-Pipeline) in aHDFS to speed up the data archival process. aHDFS-Grouping - a MapReduce-based data archiving scheme - keeps each m...
Year: 2017
DOI: 10.1109/TPDS.2017.2706686
Venue: IEEE Transactions on Parallel and Distributed Systems
Keywords: Mathematical model, Distributed databases, Redundancy, Encoding, Programming, Pipelines, Data models
Field: Block size, Distributed File System, Data modeling, Computer science, Parallel computing, Real-time computing, Shuffling, Distributed database, Erasure code, Encoding (memory), Distributed computing, Speedup
DocType: Journal
Volume: 28
Issue: 11
ISSN: 1045-9219
Citations: 3
PageRank: 0.50
References: 21
Authors (5)
Name              Order  Citations  PageRank
Yuanqi Chen       1      6          3.92
Yi Zhou           2      230        32.97
Shubbhi Taneja    3      5          1.87
Xiao Qin          4      1836       125.69
Jianzhong Huang   5      87         19.32