Title |
---|
Label-Aware Distributed Ensemble Learning: A Simplified Distributed Classifier Training Model for Big Data. |
Abstract |
---|
Label-Aware Distributed Ensemble Learning (LADEL) is a programming model and an associated implementation for distributing any classifier training to handle Big Data. It only requires users to specify the training data source, the classification algorithm and the desired parallelization level. First, a distributed stratified sampling algorithm is proposed to generate stratified samples from large, pre-partitioned datasets in a shared-nothing architecture. It executes in a single pass over the data and minimizes inter-machine communication. Second, the specified classification algorithm training is parallelized and executed on any number of heterogeneous machines. Finally, the trained classifiers are aggregated to produce the final classifier. |
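
The three stages named in the abstract (stratified sampling over pre-partitioned data, independent training per sample, aggregation of the trained classifiers) can be illustrated with a minimal, single-process Python sketch. This is not the LADEL implementation. The round-robin dealing scheme, the nearest-centroid classifier, the majority-vote aggregation, and all function names (`stratified_split`, `train_centroid`, `predict_centroid`, `ensemble_predict`) are assumptions chosen only to keep the example short and self-contained.

```python
from collections import Counter, defaultdict


def stratified_split(partitions, k):
    """Deal records into k samples in one pass, keeping label proportions.

    Assumption: each partition is handled independently (no shared state),
    and records of each label are dealt round-robin across the k samples.
    """
    samples = [[] for _ in range(k)]
    for offset, part in enumerate(partitions):
        seen = defaultdict(int)  # per-label counter, local to this partition
        for features, label in part:
            # The partition index shifts the starting sample so that
            # leftovers are not always dealt to sample 0.
            samples[(seen[label] + offset) % k].append((features, label))
            seen[label] += 1
    return samples


def train_centroid(sample):
    """Stand-in classifier: compute one mean vector (centroid) per label."""
    sums, counts = {}, Counter()
    for features, label in sample:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] += 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def predict_centroid(model, features):
    """Return the label whose centroid is closest (squared Euclidean)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda lab: dist(model[lab]))


def ensemble_predict(models, features):
    """Aggregate the trained classifiers by majority vote (an assumption)."""
    votes = Counter(predict_centroid(m, features) for m in models)
    return votes.most_common(1)[0][0]


# Two hypothetical data partitions, as if stored on two machines.
partitions = [
    [((0.0, 0.1), "a"), ((0.2, 0.0), "a"), ((5.0, 5.1), "b"), ((5.2, 4.9), "b")],
    [((0.1, 0.2), "a"), ((0.3, 0.1), "a"), ((4.8, 5.0), "b"), ((5.1, 5.3), "b")],
]

samples = stratified_split(partitions, k=2)
# In a distributed setting each sample would be trained on a separate machine.
models = [train_centroid(s) for s in samples]

print(ensemble_predict(models, (0.1, 0.1)))
print(ensemble_predict(models, (5.0, 5.0)))
```

Here `k` plays the role of the user-specified parallelization level. Each sample keeps the same label proportions as the full dataset, so any classifier trained on one sample sees every class.
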
Year | DOI | Venue |
---|---|---|
2019 | 10.1016/j.bdr.2018.11.001 | Big Data Research |
Keywords | Field | DocType |
Big Data, Analytics, Distributed, Machine learning, Classification | Data mining, Spark (mathematics), Programming paradigm, Computer science, Rewriting, Classifier (linguistics), Statistical classification, Big data, Ensemble learning, Speedup | Journal |
Volume | ISSN | Citations |
15 | 2214-5796 | 0 |
PageRank | References | Authors |
0.34 | 16 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Shady Khalifa | 1 | 5 | 2.27 |
Patrick Martin | 2 | 148 | 18.22 |
Rebecca Young | 3 | 0 | 0.68 |