Title
SP-TSRM: A Data Grouping Strategy in Distributed Storage System.
Abstract
With the development of smart devices and social media, massive amounts of unstructured data are uploaded to distributed storage systems. Because accesses to unstructured data involve multiple users and high concurrency, they pose new challenges to traditional distributed storage systems designed for large files. We propose a grouping strategy that identifies data accessed together, based on disk access logs from a real distributed storage system environment. When any data in a group is accessed, the whole group is prefetched from disk into the cache. First, we conduct a statistical analysis of the access logs and propose a preliminary classification method that classifies files by spatiotemporal locality. Second, we propose a strength-priority tree structure relation model (SP-TSRM) to mine file groups efficiently. Finally, experiments show that the proposed model significantly improves the cache hit rate, thereby improving the read efficiency of distributed storage systems.
Year
2018
Venue
ICA3PP
Field
Locality, Social media, Computer science, Concurrency, Cache, Distributed data store, Upload, Unstructured data, Tree structure, Distributed computing
DocType
Conference
Citations
0
PageRank
0.34
References
9
Authors
5
Name
Order
Citations
PageRank
Dongjie Zhu144.77
Haiwen Du201.01
Ning Cao31821.45
Xueming Qiao401.01
Yanyan Liu5389.19