Adaptive Caching in Big SQL using the HDFS Cache.

Paper Info

Title
Adaptive Caching in Big SQL using the HDFS Cache.

Abstract
The memory and storage hierarchy in database systems is currently undergoing a radical evolution in the context of Big Data systems. SQL-on-Hadoop systems share data with other applications in the Big Data ecosystem by storing their data in HDFS, using open file formats. However, they do not provide automatic caching mechanisms for storing data in memory. In this paper, we describe the architecture of IBM Big SQL and its use of the HDFS cache as an alternative to the traditional buffer pool, allowing in-memory data to be shared with other Big Data applications. We design novel adaptive caching algorithms for Big SQL tailored to the challenges of such an external cache scenario. Our experimental evaluation shows that only our adaptive algorithms perform well for diverse workload characteristics, and are able to adapt to evolving data access patterns. Finally, we discuss our experiences in addressing the new challenges imposed by external caching and summarize our insights about how to direct ongoing architectural evolution of external caching mechanisms.

Year	2016
DOI	10.1145/2987550.2987553
Venue	SoCC
Keywords	SQL-on-Hadoop, HDFS Caching
Field	SQL, File format, IBM, Cache, Computer science, Cache algorithms, Smart Cache, Big data, Data access, Database
DocType	Conference
Citations	12
PageRank	0.77
References	17
Authors	6

Authors (6 rows)

Name	Order	Citations	PageRank
Avrilia Floratou	1	213	16.29
Nimrod Megiddo	2	4244	668.46
Navneet Potti	3	38	4.23
Fatma Özcan	4	99	6.71
Uday Kale	5	12	0.77
Jan Schmitz-Hermes	6	12	0.77