Title
Term Filtering with Bounded Error
Abstract
In this paper, we consider a novel problem, term filtering with bounded error, which reduces the term (feature) space by eliminating terms without information loss or with bounded information loss. Unlike existing work, the obtained term space provides a complete view of the original term space. Moreover, several important questions can be answered, such as 1) how different terms interact with each other and 2) how the filtered terms can be represented by the other terms. We investigate the term filtering problem theoretically, link it to the Geometric Covering By Discs problem, and prove its NP-hardness. We present two novel approaches for lossless and lossy term filtering, with bounds on the introduced error. Experimental results on multiple text mining tasks validate the effectiveness of the proposed approaches.
Year
2010
DOI
10.1109/ICDM.2010.131
Venue
ICDM
Keywords
geometric covering,lossy term filtering,different terms interact,np-hardness,information filtering,lossless term filtering,original term space,feature space reduction,discs problem,term space,lossy term,computational complexity,filtered term,term filtering,multiple text mining tasks,bounded error,data mining,term space reduction,novel approach,information loss,feature space,filtering,correlation,measurement,text mining
Field
Data mining,Information loss,Lossy compression,Computer science,Filter (signal processing),Filtering problem,Correlation,Artificial intelligence,Bounded error,Machine learning,Computational complexity theory,Bounded function
DocType
Conference
ISSN
1550-4786
E-ISBN
978-0-7695-4256-0
ISBN
978-0-7695-4256-0
Citations
0
PageRank
0.34
References
30
Authors
4
Name        Order   Citations   PageRank
Zi Yang     1       33          5.48
W. Li       2       19          6.15
Jie Tang    3       5871        300.22
Juanzi Li   4       2526        154.08