Abstract |
---|
We describe an ensemble approach to learning salient regions from data partitioned according to the distributed processing requirements of large-scale simulations. The volume of the data is such that classifiers can train only on data local to a given partition. Classes will likely be missing from some, or even most, partitions. We combine a fast ensemble learning algorithm with scaled probabilistic majority voting in order to learn an accurate classifier from such data. We order predicted regions to increase the likelihood that most of the initial set of presented regions are salient. Results from a simulation of a casing being dropped show that regions of interest are successfully identified and ordered. This approach is much faster than manually browsing and visualizing terabyte or larger simulations to find regions of interest. |
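
The voting and ordering steps mentioned in the abstract can be illustrated with a minimal sketch. The record does not give the paper's actual scaling rule, so everything here is an assumption: the function names `scaled_probabilistic_vote` and `rank_regions`, the per-classifier `weights`, and the convention that a classifier trained on a partition simply omits classes it never saw are all hypothetical.

```python
from collections import defaultdict


def scaled_probabilistic_vote(predictions, weights):
    """Combine per-partition class probabilities into one label.

    predictions: list of dicts mapping class label -> probability, one dict
                 per partition classifier. A classifier leaves out any class
                 that was missing from its partition (assumed convention).
    weights:     list of non-negative scale factors, one per classifier
                 (hypothetical; the paper's scaling rule is not given here).
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per classifier")

    # Accumulate weight * probability for every class any classifier reports.
    totals = defaultdict(float)
    for probs, weight in zip(predictions, weights):
        for label, p in probs.items():
            totals[label] += weight * p

    norm = sum(totals.values())
    if norm <= 0:
        raise ValueError("no classifier contributed a positive vote")

    # Normalize so the combined scores sum to 1, then pick the top class.
    scores = {label: total / norm for label, total in totals.items()}
    best = max(scores, key=scores.get)
    return best, scores


def rank_regions(region_scores):
    """Order region ids by descending salient score, most salient first."""
    return sorted(region_scores, key=region_scores.get, reverse=True)


if __name__ == "__main__":
    votes = [
        {"salient": 0.9, "background": 0.1},
        {"background": 1.0},  # this partition never saw the salient class
        {"salient": 0.6, "background": 0.4},
    ]
    label, scores = scaled_probabilistic_vote(votes, [1.0, 0.5, 1.0])
    print(label, scores)
    print(rank_regions({"r1": 0.2, "r2": 0.9, "r3": 0.5}))
```

Down-weighting the classifier that never saw the salient class keeps it from outvoting the classifiers that did, which is one plausible motivation for scaling the votes; sorting regions by combined score then puts the most likely salient regions first.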

Year | DOI | Venue |
---|---|---|
2008 | 10.1109/ICPR.2008.4761265 | Tampa, FL |

Keywords | Field | DocType |
---|---|---|
data visualisation,pattern classification,data classifier,data partition,distributed processing requirements,ensemble approach,ensemble learning algorithm,large-scale simulations,salient regions | Data mining,Data modeling,Data visualization,Pattern recognition,Computer science,Terabyte,Artificial intelligence,Probabilistic logic,Statistical classification,Classifier (linguistics),Ensemble learning,Salient | Conference |

ISSN | ISBN | Citations |
---|---|---|
1051-4651 | 978-1-4244-2174-9 | 2 |

PageRank | References | Authors |
---|---|---|
0.39 | 10 | 5 |

Name | Order | Citations | PageRank |
---|---|---|---|
Larry Shoemaker | 1 | 13 | 1.93 |
Robert E. Banfield | 2 | 358 | 17.16 |
Larry O. Hall | 3 | 5 | 0.78 |
Kevin W. Bowyer | 4 | 11121 | 734.33 |
W. Philip Kegelmeyer | 5 | 3498 | 146.54 |