Title
Learning to Predict Salient Regions from Disjoint and Skewed Training Sets
Abstract
We present an ensemble learning approach that achieves accurate predictions from arbitrarily partitioned data. The partitions arise from the distributed processing requirements of a large-scale simulation, where the data volume is such that classifiers can train only on data local to a given partition. Because the partitioning reflects the needs of efficient simulation analysis rather than those of data mining, the class statistics vary across partitions; indeed, some classes will likely be absent from some partitions. We combine a fast ensemble learning algorithm with majority voting to generate an accurate working model of the simulation. Results from several simulations show that regions of interest are successfully identified in spite of training set class imbalances. Accuracy is analyzed both at the level of nodes in the simulation data structure and in terms of higher-level regions of interest. Over 98% of salient regions are found in independent test sets. Hence, this approach will be a significant time saver for simulation users and developers.
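The idea of training one model per partition and combining them by majority vote can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the nearest-centroid learner is a simple stand-in for the fast ensemble learner the abstract mentions (it is not the paper's algorithm), and the labels "salient" and "unknown", the toy feature vectors, and the function names are hypothetical.

```python
from collections import Counter

def train_centroids(samples):
    """Fit a nearest-centroid model on one partition's (features, label) pairs.

    Stand-in learner for illustration only; it knows only the classes
    that actually occur in its own partition.
    """
    sums, counts = {}, Counter()
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] += 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict_one(model, features):
    """Return the label whose centroid is closest to the feature vector."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))

def majority_vote(models, features):
    """Let every per-partition model vote and return the most common label."""
    votes = Counter(predict_one(m, features) for m in models)
    return votes.most_common(1)[0][0]

# Hypothetical disjoint partitions with skewed class statistics;
# the second partition contains no "salient" examples at all.
partitions = [
    [((0.9, 0.8), "salient"), ((0.1, 0.2), "unknown")],
    [((0.2, 0.1), "unknown"), ((0.0, 0.3), "unknown")],
    [((0.8, 0.9), "salient"), ((0.85, 0.7), "salient"), ((0.3, 0.1), "unknown")],
]

# Each model is trained only on the data local to its partition.
models = [train_centroids(p) for p in partitions]

print(majority_vote(models, (0.9, 0.9)))
print(majority_vote(models, (0.1, 0.1)))
```

In this toy setup the model trained on the second partition can only ever vote "unknown", yet the vote of the other two models still decides the outcome for a point near the salient examples. This is the sense in which voting across partitions can compensate for classes missing from individual training sets.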
Year
2006
DOI
10.1109/ICTAI.2006.75
Venue
ICTAI
Keywords
learning (artificial intelligence), class statistics, distributed processing, ensemble learning, large scale simulation
Field
Data mining, Disjoint sets, Computer science, Artificial intelligence, Majority rule, Ensemble learning, Spite, Training set, Data structure, Pattern recognition, Partition (number theory), Machine learning, Salient
DocType
Conference
ISSN
1082-3409
ISBN
0-7695-2728-0
Citations
3
PageRank
0.40
References
5
Authors
5
Name
Order
Citations
PageRank
Larry Shoemaker1131.93
Robert E. Banfield235817.16
Lawrence O. Hall35543335.87
Kevin W. Bowyer411121734.33
W. Philip Kegelmeyer53498146.54