Title
CSVD: clustering and singular value decomposition for approximate similarity search in high-dimensional spaces
Abstract
Nearest-neighbor search of high-dimensionality spaces is critical for many applications, such as content-based retrieval from multimedia databases, similarity search of patterns in data mining, and nearest-neighbor classification. Unfortunately, even with the aid of the commonly used indexing schemes, the performance of nearest-neighbor (NN) queries deteriorates rapidly with the number of dimensions. We propose a method, called Clustering with Singular Value Decomposition (CSVD), which supports efficient approximate processing of NN queries, while maintaining good precision-recall characteristics. CSVD groups homogeneous points into clusters and separately reduces the dimensionality of each cluster using SVD. Cluster selection for NN queries relies on a branch-and-bound algorithm and within-cluster searches can be performed with traditional or in-memory indexing methods. Experiments with texture vectors extracted from satellite images show that CSVD achieves significantly higher dimensionality reduction than plain SVD for the same normalized mean squared error (NMSE), which translates into a higher efficiency in processing approximate NN queries.
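The cluster-then-reduce idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name a clustering algorithm, so plain k-means is used here as a stand-in, and the function names (`kmeans`, `csvd_fit`, `reconstruct`, `nmse`), the synthetic data, and the choice to normalize the squared error by the total variance around the global mean are all assumptions made for illustration. Cluster selection by branch-and-bound and within-cluster indexing are not covered.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means, used only as a stand-in clustering step."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Squared distance from every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return labels

def csvd_fit(X, k, p):
    """Cluster X into k groups, then keep the top-p SVD directions per group."""
    labels = kmeans(X, k)
    model = []
    for j in range(k):
        members = X[labels == j]
        if len(members) == 0:
            model.append(None)  # empty cluster: nothing to reduce
            continue
        mean = members.mean(axis=0)
        # Rows of Vt are the principal directions of the centered cluster.
        _, _, Vt = np.linalg.svd(members - mean, full_matrices=False)
        model.append((mean, Vt[:p]))
    return labels, model

def reconstruct(X, labels, model):
    """Project each point onto its cluster's p-dimensional subspace and back."""
    Xhat = np.empty_like(X)
    for j, entry in enumerate(model):
        if entry is None:
            continue
        mean, basis = entry
        idx = labels == j
        Xhat[idx] = (X[idx] - mean) @ basis.T @ basis + mean
    return Xhat

def nmse(X, Xhat):
    """Squared reconstruction error divided by total variance (assumed normalization)."""
    return ((X - Xhat) ** 2).sum() / ((X - X.mean(axis=0)) ** 2).sum()

# Synthetic data (hypothetical): three groups, each spread along its own 2-D subspace.
rng = np.random.default_rng(42)
d = 8
blocks = []
for _ in range(3):
    center = rng.normal(scale=10.0, size=d)
    directions = rng.normal(size=(2, d))
    coeffs = rng.normal(size=(200, 2))
    blocks.append(center + coeffs @ directions + 0.05 * rng.normal(size=(200, d)))
X = np.vstack(blocks)

p = 2
labels_1, model_1 = csvd_fit(X, k=1, p=p)  # k=1 reduces to plain SVD
labels_3, model_3 = csvd_fit(X, k=3, p=p)  # per-cluster SVD
nmse_plain = nmse(X, reconstruct(X, labels_1, model_1))
nmse_csvd = nmse(X, reconstruct(X, labels_3, model_3))
print("NMSE, plain SVD:", nmse_plain)
print("NMSE, per-cluster SVD:", nmse_csvd)
```

For a fixed number of retained dimensions `p`, the per-cluster error in this sketch cannot exceed the plain-SVD error, because the global subspace is always one of the candidate subspaces for each cluster. That is the mechanism behind the abstract's claim that fewer dimensions suffice for the same NMSE.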
Year
2003
DOI
10.1109/TKDE.2003.1198398
Venue
Knowledge and Data Engineering, IEEE Transactions
Keywords
data mining, database indexing, mean square error methods, multimedia databases, pattern clustering, query processing, singular value decomposition, tree searching, CSVD, approximate similarity search, branch-and-bound algorithm, clustering, content-based retrieval, experiments, high-dimensional spaces, indexing, nearest-neighbor classification, nearest-neighbor search, normalized mean squared error, satellite images
Field
Singular value decomposition, Data mining, Dimensionality reduction, Pattern recognition, Computer science, Search engine indexing, Curse of dimensionality, Artificial intelligence, Database index, Cluster analysis, Nearest neighbor search, Principal component analysis
DocType
Journal
Volume
15
Issue
3
ISSN
1041-4347
Citations
40
PageRank
2.72
References
23
Authors
3
Name
Order
Citations
PageRank
Vittorio Castelli 1928129.71
Alexander Thomasian 21242357.16
Chung-sheng Li 31372222.33