Title
xPAD: a platform for analytic data flows
Abstract
As enterprises become more automated, real-time, and data-driven, they need to integrate new data sources and specialized processing engines. The traditional business intelligence architecture of Extract-Transform-Load (ETL) flows, followed by querying, reporting, and analytic operations, is being generalized to analytic data flows that utilize a variety of data types and operations. These complicated flows are difficult to design, implement and maintain since they span a variety of systems. Additionally, new design requirements may be imposed such as design for fault-tolerance, freshness, maintainability, sampling, etc. To reduce development time and maintenance costs, automation is needed. We present xPAD, our platform to manage analytic data flows. xPAD enables flow design. We show how these designs can be optimized, not just for performance, but for other objectives as well. xPAD is engine-agnostic. We show how it can generate executable code for a number of execution engines. It can also import existing flows from other engines and optimize those flows. In that way, it can transform a flow written for one engine into an optimized flow for a different engine. In our demonstration, we will also use various example flows to show optimization for different objectives and comparison of flow execution on different engines.
Year
2013
DOI
10.1145/2463676.2465247
Venue
SIGMOD Conference
Keywords
optimized flow, new design requirement, new data source, analytic data flow, flow design, data type, analytic data, different engine, complicated flow, flow execution, analytics, code generation, optimization
Field
Data mining, Computer science, Automation, Code generation, Data type, Sampling (statistics), Analytics, Business intelligence, Maintainability, Database, Executable
DocType
Conference
Citations
2
PageRank
0.45
References
5
Authors
3
Name | Order | Citations | PageRank
Alkis Simitsis | 1 | 1665 | 94.62
Kevin Wilkinson | 2 | 9 | 1.89
Petar Jovanovic | 3 | 62 | 7.78