Abstract | ||
---|---|---|
What-if analysis evaluates hypothetical scenarios built on historical data, so it can give users of decision support systems more meaningful information than classical OLAP (on-line analytical processing). Big data OLAP systems are generally based on the MapReduce computation model, which handles large data sets well in batch-processing mode but is not suited to real-time response requirements. Merging the delta table is a key step in what-if processing, yet classical delta-table merge algorithms are severely constrained in time and space. Multi-scenario hypotheses over historical data in big data analytical processing need efficient what-if data view support. This paper therefore proposes two novel algorithms, one based on a Bloom filter and one based on a distributed cache, that significantly improve the performance of delta-table merging. Compared with Hive on the standard SSB data set, the Bloom-filter-based algorithm is shown to be 30% faster. For smaller delta tables, the distributed-cache-based algorithm achieves even greater improvements. |
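
The abstract does not describe the algorithms themselves, so the following is only a minimal single-process sketch of the general idea behind a Bloom-filter-assisted delta-table merge, not the paper's MapReduce implementation. It assumes base rows are `(key, value)` pairs, that the delta table maps a key to its new value, and that `None` marks a deletion; the names `BloomFilter` and `merge_delta` are hypothetical. The filter lets most unchanged base rows skip the delta lookup entirely.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a single hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def merge_delta(base_rows, delta):
    """Merge a delta table into base rows (assumed format, see lead-in).

    base_rows: iterable of (key, value) pairs.
    delta: dict mapping key -> new value, with None meaning "delete".
    """
    bloom = BloomFilter()
    for key in delta:
        bloom.add(key)

    merged = {}
    matched = set()
    for key, value in base_rows:
        if not bloom.might_contain(key):
            merged[key] = value  # definitely not in the delta table
        elif key in delta:
            matched.add(key)
            if delta[key] is not None:
                merged[key] = delta[key]  # updated row
            # a None entry deletes the row, so nothing is emitted
        else:
            merged[key] = value  # Bloom filter false positive
    # Delta rows that matched no base row are treated as inserts.
    for key, value in delta.items():
        if key not in matched and value is not None:
            merged[key] = value
    return merged
```

In a distributed setting the filter would typically be built once from the small delta table and shipped to every task scanning the base table, which is where the saving over a full join would come from.
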
Year | DOI | Venue |
---|---|---|
2013 | 10.1109/CBD.2013.40 | CBD |
Keywords | DocType | ISBN |
What-if Analysis, Hadoop, Delta Table, Bloom Filter, MapReduce, OLAP | Conference | 978-1-4799-3260-3 |
Citations | PageRank | References |
0 | 0.34 | 7 |
Authors | ||
3 |