Title
Real-Time Snapshot Maintenance with Incremental ETL Pipelines in Data Warehouses
Abstract
Multi-version concurrency control (MVCC) is now widely used in data warehouses to give OLAP queries and ETL maintenance flows concurrent access. A snapshot is taken of the existing warehouse tables so that a query can be answered independently of concurrent updates. In this work, we extend this snapshot with the deltas that reside at the source side of the ETL flows. Before a query is answered, the relevant tables are first refreshed with exactly those source deltas that had been captured by the time the query arrived (the so-called query-driven policy). Snapshot maintenance is performed by an incremental recomputation pipeline that is flushed by a set of consecutive deltas belonging to a sequence of incoming queries. A workload scheduler is used to achieve a serializable schedule of concurrent maintenance tasks and OLAP queries. Performance is evaluated using read-heavy and update-heavy workloads.
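The query-driven policy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Delta`, `Warehouse`, `refresh`, and `answer` are hypothetical, the source is modelled as an in-memory list of timestamped deltas, and queries are handled one at a time in arrival order, so the schedule is serializable trivially. The pipelining and MVCC machinery from the paper is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Delta:
    ts: int      # time the change was captured at the source (assumed integer clock)
    key: str
    amount: int  # change to add to the aggregate stored for `key`


@dataclass
class Warehouse:
    table: dict = field(default_factory=dict)  # key -> aggregated value
    applied_upto: int = 0                      # refresh horizon: deltas up to here are applied

    def refresh(self, source_log, query_ts):
        """Apply source deltas with applied_upto < ts <= query_ts, in timestamp order."""
        pending = [d for d in source_log if self.applied_upto < d.ts <= query_ts]
        for d in sorted(pending, key=lambda d: d.ts):
            self.table[d.key] = self.table.get(d.key, 0) + d.amount
        self.applied_upto = max(self.applied_upto, query_ts)

    def answer(self, source_log, query_ts, key):
        """Query-driven policy: refresh up to the query's arrival time, then read."""
        self.refresh(source_log, query_ts)
        return self.table.get(key, 0)


# Hypothetical source-side delta log and three queries arriving at times 2, 4, and 4.
source_log = [Delta(1, "eu", 10), Delta(3, "eu", 5), Delta(6, "us", 7)]
wh = Warehouse()
q1 = wh.answer(source_log, query_ts=2, key="eu")  # sees only the delta captured at ts=1
q2 = wh.answer(source_log, query_ts=4, key="eu")  # additionally sees the delta at ts=3
q3 = wh.answer(source_log, query_ts=4, key="us")  # the ts=6 delta was captured after arrival
```

Each query sees exactly the deltas captured up to its own arrival time, and each delta is applied once because `applied_upto` records how far the table has been refreshed.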
Year
2015
DOI
10.1007/978-3-319-22729-0_17
Venue
Lecture Notes in Computer Science
Field
Data warehouse, Data mining, Pipeline transport, Serialization, Concurrency control, Computer science, Workload, Online analytical processing, Snapshot (computer storage), Database
DocType
Conference
Volume
9263
ISSN
0302-9743
Citations
2
PageRank
0.37
References
5
Authors
4
Name                Order  Citations  PageRank
Weiping Qu          1      6          3.15
Vinanthi Basavaraj  2      2          0.37
Sahana Shankar      3      4          0.73
Stefan Dessloch     4      91         11.69