Title
Optimizing ETL by a Two-Level Data Staging Method
Abstract
In data warehousing, data from source systems are loaded into a central data warehouse (DW) through extraction, transformation, and loading (ETL). The standard ETL approach usually uses sequential jobs to process data with dependencies, such as dimension and fact data. Processing so-called early-/late-arriving data, which arrive out of order, is a non-trivial task. This paper proposes a two-level data staging area method to optimize ETL. The proposed method is an all-in-one solution that supports processing different types of data from operational systems, including early-/late-arriving data and fast-/slowly-changing data. The additional staging area decouples the loading process from data extraction and transformation, which improves ETL flexibility and minimizes intervention in the data warehouse. An empirical evaluation shows that the proposed method is more efficient and less intrusive than the standard ETL method.
Year: 2016
DOI: 10.4018/IJDWM.2016070103
Venue: IJDWM
Keywords: Data Staging, Data Warehousing, ETL, Early-/Late-Arriving Data, Optimization
Field: Data warehouse, Data mining, Computer science, Staging area, Data type, Data extraction, Out-of-order execution, Database
DocType: Journal
Volume: 12
Issue: 3
ISSN: 1548-3924
Citations: 0
PageRank: 0.34
References: 14
Authors: 4
Name                  Order  Citations  PageRank
Xiufeng Liu           1      108        14.69
Nadeem Iftikhar       2      80         11.50
Huon Huo              3      0          0.34
Per Sieverts Nielsen  4      26         3.83