Title
Full and Partial Data Cube Computation and Representation over Commodity PCs
Abstract
The PnP (Pipe 'n Prune) approach is considered one of the most promising approaches for partial cube computation over distributed-memory computer architectures; however, it generates a huge amount of redundant data. In general, PnP does not consider data skew (non-uniformity) when partitioning its workload and thus imposes maximum data redundancy even on uniform data. To address this, we implement the P2CDM (Parallel Cube Computation with Distributed Memory) approach, which minimizes communication and keeps data redundancy low. Globally, across the entire cluster, P2CDM automatically generates data redundancy only for skewed values among all dimensions of a data warehouse. Locally, at each host, P2CDM prunes cube cells using the MCG approach. The result is a distributed approach that computes massive full or partial data cubes over a cluster of commodity PCs. Experiments show that both approaches have similar speedup, but P2CDM is 20-25% faster and consumes 30-40% less memory at each host of the cluster than PnP.
Year
2012
DOI
10.1109/IRI.2012.6303074
Venue
2012 IEEE 13th International Conference on Information Reuse and Integration (IRI)
Keywords
silver, data warehouses, data structures, memory management, distributed databases, redundancy
Field
Data warehouse, Data mining, Computer science, Memory management, Data redundancy, Redundancy (engineering), Artificial intelligence, Speedup, Data structure, Parallel computing, Distributed memory, Data cube, Machine learning
DocType
Conference
Citations
0
PageRank
0.34
References
14
Authors
2
Name, Order, Citations, PageRank
Angelica Aparecida Moreira, 1, 0, 0.34
Joubert de Castro Lima, 2, 16, 7.23