Title
Spyglass: fast, scalable metadata search for large-scale storage systems
Abstract
The scale of today's storage systems has made it increasingly difficult to find and manage files. To address this, we have developed Spyglass, a file metadata search system that is specially designed for large-scale storage systems. Using an optimized design, guided by an analysis of real-world metadata traces and a user study, Spyglass allows fast, complex searches over file metadata to help users and administrators better understand and manage their files. Spyglass achieves fast, scalable performance through the use of several novel metadata search techniques that exploit metadata search properties. Flexible index control is provided by an index partitioning mechanism that leverages namespace locality. Signature files are used to significantly reduce a query's search space, improving performance and scalability. Snapshot-based metadata collection allows incremental crawling of only modified files. A novel index versioning mechanism provides both fast index updates and "back-in-time" search of metadata. An evaluation of our Spyglass prototype using our real-world, large-scale metadata traces shows search performance that is 1-4 orders of magnitude faster than existing solutions. The Spyglass index can quickly be updated and typically requires less than 0.1% of disk space. Additionally, metadata collection is up to 10× faster than existing approaches.
Year
2009
Venue
FAST
Keywords
large-scale storage system, large-scale metadata trace, metadata collection, file metadata, novel metadata search technique, snapshot-based metadata collection, complex search, search performance, real-world metadata trace, scalable metadata search, metadata search property, file metadata search system, storage system
Field
Metadata, Metadata repository, Meta Data Services, Computer science, Data element, Namespace, Snapshot (computer storage), Database, Software versioning, Scalability
DocType
Conference
Citations
58
PageRank
2.21
References
36
Authors
5
Name               Order  Citations  PageRank
Andrew W. Leung    1      295        13.93
Minglong Shao      2      184        9.53
Timothy Bisson     3      172        8.04
Shankar Pasupathy  4      575        20.92
Ethan L. Miller    5      2870       281.96