Title
An Efficient Data Selection Policy for Search Engine Cache Management
Abstract
Caching is an effective optimization in search engines. The data selection policy, which determines the data to be cached in memory, plays a key role in caching. However, current data selection policies are not well suited to hybrid storage architectures with solid state disks (SSDs), which have gradually replaced hard disk drives (HDDs) in search engines. In this paper, we present an Efficient Data Selection policy (EDS) for search engine cache management, which views the cache media as a knapsack and views results and posting lists as items. The best benefit can then be computed by greedy algorithms. To verify its effectiveness, we carry out a series of experiments to study the essential factors of data selection in different architectures, including HDD, SSD, and an SSD-based hybrid storage architecture that uses the SSD as a secondary cache for memory. The experimental results demonstrate that the proposed policy improves the hit ratio by 20.04% and the retrieval performance on the HDD, SSD, and hybrid architectures by 31.98%, 28.72%, and 23.24%, respectively.
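The knapsack view described in the abstract can be illustrated with a minimal sketch. The abstract does not say how benefit is estimated or which greedy rule EDS uses, so everything below is an assumption for illustration only: the `CacheItem` fields, the `greedy_select` function, and the benefit-per-byte ordering are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheItem:
    key: str        # hypothetical identifier, e.g. a query or an index term
    kind: str       # "result" or "posting_list"
    size: int       # bytes the item would occupy in the cache
    benefit: float  # assumed estimate of the cost saved if the item is cached


def greedy_select(items, capacity):
    """Fill the cache (the knapsack) by descending benefit per byte.

    This is the classic greedy heuristic for the knapsack problem; it is
    an assumption here, not necessarily the rule used by EDS.
    """
    ranked = sorted(
        (it for it in items if it.size > 0),
        key=lambda it: it.benefit / it.size,
        reverse=True,
    )
    chosen, used = [], 0
    for it in ranked:
        # Skip items that no longer fit; smaller ones may still fit later.
        if used + it.size <= capacity:
            chosen.append(it)
            used += it.size
    return chosen, used


# Made-up example data: two query results and two posting lists.
items = [
    CacheItem("q:weather", "result", 20, 60.0),
    CacheItem("t:cache", "posting_list", 50, 100.0),
    CacheItem("t:engine", "posting_list", 40, 40.0),
    CacheItem("q:news", "result", 10, 5.0),
]
chosen, used = greedy_select(items, capacity=80)
print([it.key for it in chosen], used)
```

With a capacity of 80 bytes, the sketch keeps the two items with the highest benefit per byte, skips the posting list that no longer fits, and fills the remaining space with the small result entry. A greedy ratio ordering is simple and fast, but for indivisible items it is a heuristic and does not guarantee the optimal knapsack solution in general.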
Year
2015
DOI
10.1109/HPCC-CSS-ICESS.2015.216
Venue
HPCC/CSS/ICESS
Keywords
search engine, cache management, data selection, solid state disk, hybrid storage architecture
Field
Cache-oblivious algorithm, Cache invalidation, Cache pollution, Cache, Computer science, Cache algorithms, Page cache, Real-time computing, Cache coloring, Smart Cache, Operating system, Database
DocType
Conference
ISSN
2576-3504
Citations
0
PageRank
0.34
References
14
Authors
7
Name            Order   Citations   PageRank
Xinhua Dong     1       13          1.93
Ruixuan Li      2       405         69.47
Heng He         3       24          3.47
Xiwu Gu         4       59          14.39
Mudar Sarem     5       73          15.90
Meikang Qiu     6       3722        246.98
Keqin Li        7       2778        242.13