Title
Depth-Based Novelty Detection and Its Application to Taxonomic Research
Abstract
It is estimated that less than 10 percent of the world's species have been described, yet species are being lost daily due to human destruction of natural habitats. The task of describing the earth's remaining species is made harder by the shrinking number of practicing taxonomists and the very slow pace of traditional taxonomic research. In this article, we tackle, from a novelty detection perspective, one of the most important and challenging research objectives in taxonomy: new species identification. We propose a unique and efficient novelty detection framework based on statistical depth functions. Statistical depth functions provide a "center-outward ordering" of multidimensional data, starting from the "deepest" point. In this sense, they can detect observations that appear extreme relative to the rest of the observations, i.e., novel observations. Of the various statistical depths, the spatial depth is especially appealing because of its computational efficiency and mathematical tractability. We propose a novel statistical depth, the kernelized spatial depth (KSD), which generalizes the spatial depth via positive definite kernels. By choosing a proper kernel, the KSD can capture the local structure of a data set where the spatial depth fails to do so. Observations with depth values less than a threshold are declared novel. The proposed algorithm is simple in structure: for a given kernel, the threshold is the only parameter. We give an upper bound on the false alarm probability of a depth-based detector, which can be used to determine the threshold. An experimental study demonstrates its excellent potential in new species discovery.
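The depth-thresholding idea in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the normalization by the sample size, the Gaussian bandwidth `sigma`, and the threshold value are all assumptions chosen for illustration. The sketch computes a kernelized spatial depth from kernel evaluations only, and with a linear kernel it reduces to the ordinary spatial depth, one minus the norm of the average unit vector pointing from the sample points to the query point.

```python
import math

def linear_kernel(x, y):
    # Plain inner product; with this kernel the depth below is the ordinary spatial depth.
    return sum(a * b for a, b in zip(x, y))

def gaussian_kernel(x, y, sigma=1.0):
    # Gaussian (RBF) kernel; sigma is an illustrative bandwidth, not a value from the paper.
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))

def kernelized_spatial_depth(x, data, kernel):
    # Depth of x relative to data, using only kernel evaluations.
    # Assumption: normalize by len(data); other conventions exist.
    kxx = kernel(x, x)
    kxy = [kernel(x, y) for y in data]
    # Kernel-induced distance between x and each sample point.
    dist = [math.sqrt(max(kxx + kernel(y, y) - 2.0 * k, 0.0))
            for y, k in zip(data, kxy)]
    # Points that coincide with x in feature space contribute a zero vector.
    idx = [i for i, d in enumerate(dist) if d > 1e-12]
    total = 0.0
    for i in idx:
        for j in idx:
            # Inner product of (phi(x) - phi(y_i)) and (phi(x) - phi(y_j)).
            num = kxx + kernel(data[i], data[j]) - kxy[i] - kxy[j]
            total += num / (dist[i] * dist[j])
    return 1.0 - math.sqrt(max(total, 0.0)) / len(data)

def detect_novel(points, data, kernel, threshold):
    # Declare a point novel when its depth falls below the threshold.
    return [kernelized_spatial_depth(p, data, kernel) < threshold for p in points]

# Toy data: four points placed symmetrically around the origin.
data = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
center, far = (0.0, 0.0), (100.0, 0.0)
flags = detect_novel([center, far], data, linear_kernel, threshold=0.1)
```

In this toy setting the origin is the deepest point under the linear kernel and the distant point has depth close to zero, so only the distant point is flagged. In practice the threshold would be chosen from a bound on the false alarm probability, as the abstract describes, rather than fixed by hand.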
Year
2007
DOI
10.1109/ICDM.2007.10
Venue
ICDM
Keywords
depth-based novelty detection, statistical depth function, remaining species, depth-based detector, false alarm probability, taxonomic research, kernelized spatial depth, multidimensional data, learning (artificial intelligence), novel statistical depth, statistical depth functions, efficient novelty detection framework, mathematical tractability, taxonomy new species identification, various statistical depth, biology computing, new species identification, new species discovery, novelty detection perspective, data mining, machine learning, spatial depth, zoology, center-outward ordering, probability, upper bound, learning artificial intelligence, data gathering, body shape, geographic range, positive definite kernel
Field
Data mining, Novelty detection, False alarm, Upper and lower bounds, Computer science, Artificial intelligence, Detector, Kernel (linear algebra), Pattern recognition, Positive-definite matrix, Local structure, Novelty, Machine learning
DocType
Conference
ISSN
1550-4786
ISBN
978-0-7695-3018-5
Citations
1
PageRank
0.38
References
15
Authors
4
Name                 Order  Citations  PageRank
Yixin Chen           1      4326       299.19
Henry L. Bart Jr.    2      6          1.54
Xin Dang             3      139        9.85
Hanxiang Peng        4      52         2.62