Title
Histogram-Based Dimensionality Reduction of Term Vector Space
Abstract
One of the central problems of free-text document processing is the curse of dimensionality. The paper presents a dimensionality reduction algorithm based on informed feature selection. Terms that describe a document are selected using histogram-like statistics, which can be computed and incrementally updated at low computational cost. As a result, the document representation can adapt to changing characteristics of the document collection. Along with the fundamental concepts, we present an empirical verification of the approach.
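The abstract does not give the exact statistics or the selection rule, so the following is only a minimal sketch of the general idea, under stated assumptions: each term keeps a histogram of its relative frequency across documents, and the histogram is updated one document at a time. The class name `TermHistograms`, the binning scheme, and the thresholds `min_df` and `max_df_ratio` are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


class TermHistograms:
    """Per-term histograms of relative term frequency (illustrative only)."""

    def __init__(self, num_bins=10):
        self.num_bins = num_bins
        self.num_docs = 0
        # term -> list of bin counts. Bin i covers relative frequencies in
        # [i / num_bins, (i + 1) / num_bins); the value 1.0 is clamped
        # into the last bin.
        self.hist = defaultdict(lambda: [0] * num_bins)

    def _bin(self, rel_freq):
        return min(int(rel_freq * self.num_bins), self.num_bins - 1)

    def add_document(self, tokens):
        """Incremental update: cost is linear in the document length."""
        if not tokens:
            return
        self.num_docs += 1
        for term, count in Counter(tokens).items():
            self.hist[term][self._bin(count / len(tokens))] += 1

    def document_frequency(self, term):
        """Number of documents containing the term (sum over its bins)."""
        return sum(self.hist[term]) if term in self.hist else 0

    def select_terms(self, min_df=2, max_df_ratio=0.5):
        # Assumed criterion (not the paper's): keep terms that are neither
        # too rare nor too common, judged by the histogram mass.
        upper = max_df_ratio * self.num_docs
        return sorted(
            term for term, bins in self.hist.items()
            if min_df <= sum(bins) <= upper
        )
```

Because `add_document` only touches the histograms of terms that occur in the new document, the selected term set can be recomputed as the collection grows without rescanning earlier documents.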
Year
2007
DOI
10.1109/CISIM.2007.35
Venue
Minneapolis, MN
Keywords
term vector space, histogram-based dimensionality reduction, gaussian distribution, document processing, vector space, principal component analysis, multidimensional systems, feature selection, statistics, frequency, image analysis, computer science, sparse matrices, text analysis, matrix decomposition, curse of dimensionality, empirical verification
Field
Histogram, Dimensionality reduction, Feature selection, Pattern recognition, Computer science, Document clustering, Document processing, Curse of dimensionality, Artificial intelligence, Machine learning, Word processing, Principal component analysis
DocType
Conference
ISBN
0-7695-2894-5
Citations
0
PageRank
0.34
References
8
Authors
3
Name | Order | Citations | PageRank
Krzysztof Ciesielski | 1 | 296 | 29.71
Mieczyslaw A. Klopotek | 2 | 366 | 78.58
Slawomir T. Wierzchon | 3 | 362 | 63.62