Title
Using classifier ensembles to label spatially disjoint data
Abstract
We describe an ensemble approach to learning from arbitrarily partitioned data. The partitioning comes from the distributed processing requirements of a large scale simulation. The volume of the data is such that classifiers can train only on data local to a given partition. As a result of the partition reflecting the needs of the simulation, the class statistics can vary from partition to partition. Some classes will likely be missing from some partitions. We combine a fast ensemble learning algorithm with probabilistic majority voting in order to learn an accurate classifier from such data. Results from simulations of an impactor bar crushing a storage canister and from facial feature recognition show that regions of interest are successfully identified in spite of the class imbalance in the individual training sets.
Year
2008
DOI
10.1016/j.inffus.2007.08.001
Venue
Information Fusion
Keywords
random forest, partitioned data, imbalanced training data, k-nearest centroids, impactor bar, accurate classifier, probabilistic voting, individual training set, class statistic, saliency, spatially disjoint data, fast ensemble, classifier ensemble, large scale simulation, facial feature recognition, ensemble approach, class imbalance, out-of-partition, region of interest, ensemble learning, majority voting, distributed processing
Field
Data mining, Disjoint sets, Computer science, Artificial intelligence, Probabilistic logic, Random forest, Classifier (linguistics), Ensemble learning, Pattern recognition, Feature recognition, Partition (number theory), Partition refinement, Machine learning
DocType
Journal
Volume
9
Issue
1
ISSN
Information Fusion
Citations
5
PageRank
0.42
References
23
Authors
5
Name | Order | Citations | PageRank
Larry Shoemaker | 1 | 13 | 1.93
Robert E. Banfield | 2 | 358 | 17.16
Lawrence O. Hall | 3 | 5543 | 335.87
Kevin W. Bowyer | 4 | 11121 | 734.33
W. Philip Kegelmeyer | 5 | 3498 | 146.54