Title
Mining hidden mixture context with ADIOS-P to improve predictive pre-fetcher accuracy
Abstract
A predictive pre-fetcher, which predicts future data access events and loads the data before users request it, has been widely studied, especially in file systems and web content servers, as a way to reduce data load latency. In scientific data visualization in particular, pre-fetching can reduce I/O waiting time. To increase prediction accuracy, we apply a data mining technique that discovers the hidden contexts in data access patterns and then make predictions based on the inferred context. Specifically, we use Probabilistic Latent Semantic Analysis (PLSA), a mixture-model-based algorithm popular in text mining, to mine hidden contexts from collected user access patterns, and then run a predictor within the discovered context. We further improve PLSA by applying the Deterministic Annealing (DA) method to mitigate the local optimum problem. In this paper we demonstrate how PLSA with DA optimization can be applied to mine hidden contexts from users' data access patterns and improve predictive pre-fetcher performance.
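The approach summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' ADIOS-P implementation: it assumes access logs are reduced to a count matrix `counts[d][w]` (how often session `d` accessed data item `w`), fits PLSA with a tempered E-step as one common form of deterministic annealing, and ranks items by the mixture probability P(w|d). The function names `plsa_da` and `predict_next`, the annealing schedule `betas`, and the smoothing constants are all hypothetical choices made for illustration.

```python
import random


def plsa_da(counts, n_topics, betas=(0.25, 0.5, 0.75, 1.0), iters=30, seed=0):
    """Fit PLSA with a tempered E-step (an assumed form of deterministic annealing).

    counts[d][w] is the number of times session d accessed item w.
    Returns (p_z_d, p_w_z): P(context | session) and P(item | context).
    """
    rng = random.Random(seed)
    n_docs, n_items = len(counts), len(counts[0])

    def normalize(row):
        total = sum(row)
        return [x / total for x in row]

    # Random positive initialization of both probability tables.
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_items)])
             for _ in range(n_topics)]

    for beta in betas:  # assumed schedule: inverse temperature rises to 1
        for _ in range(iters):
            # Small floor keeps every probability strictly positive.
            new_z_d = [[1e-12] * n_topics for _ in range(n_docs)]
            new_w_z = [[1e-12] * n_items for _ in range(n_topics)]
            for d in range(n_docs):
                for w in range(n_items):
                    n = counts[d][w]
                    if n == 0:
                        continue
                    # Tempered E-step: posterior over contexts, flattened by beta.
                    post = normalize([(p_z_d[d][z] * p_w_z[z][w]) ** beta + 1e-300
                                      for z in range(n_topics)])
                    # Accumulate expected counts for the M-step.
                    for z in range(n_topics):
                        new_z_d[d][z] += n * post[z]
                        new_w_z[z][w] += n * post[z]
            # M-step: renormalize expected counts into probabilities.
            p_z_d = [normalize(row) for row in new_z_d]
            p_w_z = [normalize(row) for row in new_w_z]
    return p_z_d, p_w_z


def predict_next(p_z_d_row, p_w_z):
    """Score every item by P(w|d) = sum_z P(z|d) P(w|z); return (best item, scores)."""
    n_items = len(p_w_z[0])
    scores = [sum(pz * p_w_z[z][w] for z, pz in enumerate(p_z_d_row))
              for w in range(n_items)]
    best = max(range(n_items), key=scores.__getitem__)
    return best, scores


if __name__ == "__main__":
    # Toy log: two groups of sessions with different access habits.
    counts = [[5, 4, 0, 0, 0], [4, 5, 1, 0, 0], [0, 0, 5, 4, 0], [0, 1, 4, 5, 0]]
    p_z_d, p_w_z = plsa_da(counts, n_topics=2)
    best, scores = predict_next(p_z_d[0], p_w_z)
    print("item to pre-fetch for session 0:", best)
```

Starting with a small `beta` flattens the context posteriors so that early iterations are less sensitive to the random initialization, which is the motivation for annealing given in the abstract. A real pre-fetcher would also need to infer the context of a new, partially observed session, which this sketch does not cover.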
Year: 2012
DOI: 10.1109/eScience.2012.6404418
Venue: eScience
Keywords: mixture model based algorithm, local optimum problem, deterministic annealing method, pre-fetching, predictive pre-fetcher accuracy, prefetch, storage management, information retrieval, users data access pattern, hidden information extraction, data access events, future data access event, io waiting time, text mining area, hidden context, web contents servers, hidden information, hidden mixture context, probabilistic latent semantic analysis, data access pattern, hidden context mining, scientific data visualization, data load latency, data mining, data visualization, adios-p, user access pattern, data mining technique, text analysis, file systems, probability, da method
Field: Data mining, Data visualization, Algorithm design, Local optimum, Computer science, Server, Probabilistic latent semantic analysis, Cluster analysis, Data access, Mixture model
DocType: Conference
ISBN: 978-1-4673-4467-8
Citations: 2
PageRank: 0.38
References: 22
Authors: 10
Name                 Order   Citations   PageRank
Jong Youl Choi       1       309         26.54
Matthew Wolf         2       575         39.27
Manish Parashar      3       3876        343.30
Scott Klasky         4       1547        99.00
Judy Qiu             5       743         43.25
Norbert Podhorszki   6       1046        83.84
Geoffrey Fox         7       4070        575.38
Dave Pugmire         8       152         18.62
Hasan Abbasi         9       660         32.61
Cristian Capdevila   10      2           0.38