Title
Decision forests for machine learning classification of large, noisy seafloor feature sets.
Abstract
Extremely randomized trees (ET) classifiers, an extension of random forests (RF), are applied to the classification of features such as seamounts derived from bathymetry data. These data are characterized by sparse training data and large, noisy feature sets, as is often the case with other geospatial data. A variety of feature metrics may be useful for this task, and we use a large number of metrics relevant to finding seamounts. The main results include: a seamount classification accuracy of 97%; an automated process that produces the most useful classification features relevant to geophysical scientists (as represented by the feature metrics); and a demonstration that topography provides the most important data representation for classification. In addition to the classification accuracy, we discuss the human-understandable set of metrics, generated by the classifier, that are most relevant to the results. High accuracy of classification of seamounts is achieved. The use of extreme classifiers is shown to perform well. Feature metric importances are provided and explain the basis of classification.
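The idea summarized in the abstract, ET classification followed by a ranking of feature importances, can be illustrated with a minimal plain-Python sketch. It is not the authors' implementation: the data are synthetic, the feature indices stand in for hypothetical metrics, and the parameters (`n_trees`, `k`, the seeds) are assumptions chosen only for illustration. Each split draws its threshold at random instead of optimizing it, which is what distinguishes ET from a standard random forest, and importances are accumulated as the weighted Gini impurity decrease per feature.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(X, y, rng, k, importances, n_total):
    """Grow one extremely randomized tree: a nested tuple, or a leaf label."""
    if len(set(y)) == 1:
        return y[0]  # pure node becomes a leaf
    best = None
    for f in rng.sample(range(len(X[0])), k):  # k random candidate features
        lo, hi = min(r[f] for r in X), max(r[f] for r in X)
        if lo == hi:
            continue
        t = rng.uniform(lo, hi)  # threshold drawn at random, not optimized
        left = [i for i, r in enumerate(X) if r[f] < t]
        right = [i for i, r in enumerate(X) if r[f] >= t]
        if not left or not right:
            continue
        child = sum(len(s) * gini([y[i] for i in s]) for s in (left, right))
        gain = gini(y) - child / len(y)
        if best is None or gain > best[0]:
            best = (gain, f, t, left, right)
    if best is None:
        return Counter(y).most_common(1)[0][0]  # no usable split: majority leaf
    gain, f, t, left, right = best
    importances[f] += gain * len(y) / n_total  # weighted impurity decrease
    kids = [build_tree([X[i] for i in s], [y[i] for i in s],
                       rng, k, importances, n_total) for s in (left, right)]
    return (f, t, kids[0], kids[1])

def fit_forest(X, y, n_trees=30, k=3, seed=0):
    """Train every tree on the full sample (no bootstrap); normalize importances."""
    rng = random.Random(seed)
    importances = [0.0] * len(X[0])
    trees = [build_tree(X, y, rng, k, importances, len(y)) for _ in range(n_trees)]
    total = sum(importances)
    return trees, [v / total for v in importances]

def predict(trees, row):
    """Majority vote over all trees."""
    votes = Counter()
    for node in trees:
        while isinstance(node, tuple):
            f, t, left, right = node
            node = left if row[f] < t else right
        votes[node] += 1
    return votes.most_common(1)[0][0]

def make_data(n, rng):
    """Synthetic stand-in data: features 0 and 1 carry signal, 2-4 are noise."""
    X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(n)]
    y = [1 if r[0] + 0.3 * r[1] > 0 else 0 for r in X]
    return X, y

data_rng = random.Random(1)
X_train, y_train = make_data(300, data_rng)
X_test, y_test = make_data(100, data_rng)
trees, importances = fit_forest(X_train, y_train)
accuracy = sum(predict(trees, r) == t for r, t in zip(X_test, y_test)) / len(y_test)
ranking = sorted(range(5), key=lambda f: -importances[f])
print("held-out accuracy:", accuracy)
print("features ranked by importance:", ranking)
```

In this toy setup the ranking should place the signal-carrying feature first, which mirrors how an importance ranking can point a domain expert toward the metrics that drive the classification. In practice a library implementation of ET would normally be used in place of hand-written trees.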
Year
2017
DOI
10.1016/j.cageo.2016.10.013
Venue
Computers & Geosciences
Keywords
Bathymetry, Topography, Seamounts, Random forests, Extremely randomized trees
Field
Geospatial analysis, Training set, Data mining, External Data Representation, Pattern recognition, Computer science, Bathymetry, Artificial intelligence, Classifier (linguistics), Statistical classification, Random forest
DocType
Journal
Volume
99
Issue
C
ISSN
0098-3004
Citations
0
PageRank
0.34
References
11
Authors
5
Name, Order, Citations, PageRank
Ed Lawson, 1, 0, 0.34
Denson Smith, 2, 0, 0.34
Donald A. Sofge, 3, 95, 24.77
Paul Elmore, 4, 20, 4.71
Frederick E. Petry, 5, 562, 69.24