Title
Scale and concurrency of GIGA+: file system directories with millions of files
Abstract
We examine the problem of scalable file system directories, motivated by data-intensive applications requiring millions to billions of small files to be ingested in a single directory at rates of hundreds of thousands of file creates every second. We introduce a POSIX-compliant scalable directory design, GIGA+, that distributes directory entries over a cluster of server nodes. For scalability, each server makes only local, independent decisions about migration for load balancing. GIGA+ uses two internal implementation tenets, asynchrony and eventual consistency, to: (1) partition an index among all servers without synchronization or serialization, and (2) gracefully tolerate stale index state at the clients. Applications, however, are provided traditional strong synchronous consistency semantics. We have built and demonstrated that the GIGA+ approach scales better than existing distributed directory implementations, delivers a sustained throughput of more than 98,000 file creates per second on a 32-server cluster, and balances load more efficiently than consistent hashing.
Year
2011
Venue
FAST
Keywords
balances load, load balancing, directory implementation, eventual consistency, POSIX-compliant scalable directory design, 32-server cluster, single directory, scalable file system directory, directory entry, small file
Field
Eventual consistency, Working directory, File system, Giga-, Directory, Computer science, Server, File system fragmentation, Operating system, Computer file
DocType
Conference
Citations
59
PageRank
1.70
References
36
Authors
2
Name            Order   Citations   PageRank
Swapnil Patil   1       306         18.05
Garth Gibson    2       257         13.77