Title
Fatman: cost-saving and reliable archival storage based on volunteer resources
Abstract
We present Fatman, an enterprise-scale archival storage system built on storage resources volunteered by underutilized web servers, typically deployed on thousands of nodes with spare storage capacity. Fatman is designed to raise the utilization of existing storage resources and cut hardware purchase costs. The two main design concerns are maximizing the resource utilization of volunteer nodes without violating Service Level Objectives (SLOs) and minimizing cost without reducing the availability of the archival system. Fatman has been deployed on tens of thousands of server nodes across several datacenters, provides more than 100 PB of storage capacity, and serves dozens of internal mass-data applications. The system consolidates storage quotas efficiently through strong isolation and budget limitation, so that nodes can contribute as many resources as possible without degrading host-level SLOs. It is also the first to improve data reliability by applying disk failure prediction to reduce failure recovery cost, an approach named fault-aware data management; on real-life production workloads this reduces the MTTR by 76.3% and decreases the file crash ratio by 35%.
Year: 2014
DOI: 10.14778/2733004.2733078
Venue: PVLDB
Field: Crash, Service level objective, Spare part, Workload, Computer science, Systems design, Archival storage, Data management, Database, Web server
DocType: Journal
Volume: 7
Issue: 13
ISSN: 2150-8097
Citations: 2
PageRank: 0.38
References: 23
Authors: 5
Name | Order | Citations | PageRank
An Qin | 1 | 12 | 3.36
Dianming Hu | 2 | 41 | 3.04
Jun Liu | 3 | 292 | 9.34
Wenjun Yang | 4 | 11 | 1.63
Dai Tan | 5 | 6 | 2.12