Title
Label-Aware Distributed Ensemble Learning: A Simplified Distributed Classifier Training Model for Big Data.
Abstract
Label-Aware Distributed Ensemble Learning (LADEL) is a programming model and an associated implementation for distributing the training of any classifier to handle Big Data. Users only need to specify the training data source, the classification algorithm, and the desired level of parallelization. First, a distributed stratified sampling algorithm is proposed to generate stratified samples from large, pre-partitioned datasets in a shared-nothing architecture; it executes in a single pass over the data and minimizes inter-machine communication. Second, training of the specified classification algorithm is parallelized and executed on any number of heterogeneous machines. Finally, the trained classifiers are aggregated to produce the final classifier.
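The three steps in the abstract (per-partition stratified sampling, independent training, aggregation) can be illustrated with a small sketch. This is not the paper's algorithm or the LADEL API, neither of which is given in the abstract. Every function name here is hypothetical, a nearest-centroid model stands in for "any classifier", majority voting is assumed as the aggregation rule, and the partitions are processed in a sequential loop rather than on separate machines.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(partition, fraction, rng):
    """Draw roughly `fraction` of each label's records from one partition."""
    by_label = defaultdict(list)
    for features, label in partition:  # single pass over the local data
        by_label[label].append((features, label))
    sample = []
    for rows in by_label.values():
        k = max(1, round(fraction * len(rows)))  # keep every label represented
        sample.extend(rng.sample(rows, k))
    return sample


def train_centroids(sample):
    """Toy stand-in classifier (assumption): one mean vector per label."""
    sums, counts = {}, Counter()
    for features, label in sample:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] += 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_one(model, features):
    """Return the label whose centroid is closest in squared distance."""
    def sq_dist(centroid):
        return sum((c - x) ** 2 for c, x in zip(centroid, features))
    return min(model, key=lambda label: sq_dist(model[label]))


def train_ensemble(partitions, fraction=0.5, seed=0):
    """Train one model per partition; a real system would run these in parallel."""
    rng = random.Random(seed)
    return [train_centroids(stratified_sample(p, fraction, rng)) for p in partitions]


def ensemble_predict(models, features):
    """Aggregate by majority vote (assumed rule, not taken from the paper)."""
    votes = Counter(predict_one(m, features) for m in models)
    return votes.most_common(1)[0][0]


# Two pre-partitioned toy datasets of (features, label) records.
partitions = [
    [((0.0, 0.1), "a"), ((0.2, 0.0), "a"), ((10.0, 9.9), "b"), ((9.8, 10.1), "b")],
    [((0.1, 0.3), "a"), ((0.3, 0.1), "a"), ((9.9, 10.2), "b"), ((10.1, 9.7), "b")],
]
models = train_ensemble(partitions)
print(ensemble_predict(models, (0.5, 0.5)), ensemble_predict(models, (9.0, 9.0)))
```

Because each partition is sampled and trained on locally, only the trained models need to be exchanged in this sketch, which is one way to keep inter-machine communication low.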
Year: 2019
DOI: 10.1016/j.bdr.2018.11.001
Venue: Big Data Research
Keywords: Big Data, Analytics, Distributed, Machine learning, Classification
Field: Data mining, Spark (mathematics), Programming paradigm, Computer science, Rewriting, Classifier (linguistics), Statistical classification, Big data, Ensemble learning, Speedup
DocType: Journal
Volume: 15
ISSN: 2214-5796
Citations: 0
PageRank: 0.34
References: 16
Authors: 3
Name              Order  Citations  PageRank
Shady Khalifa     1      5          2.27
Patrick Martin    2      148        18.22
Rebecca Young     3      0          0.68