Title
Using Mutual Information to Identify New Features for Text documents of Various Domains
Abstract
Identifying proper names, unknown words, and new terms is an important step in text processing systems. This paper describes a method that uses mutual information to collect possible segments, within the scope of a single document, as candidates for these three feature types. The construction and context of each possible feature are then examined to determine its type, canonical form, and meaning. With very little added domain-specific knowledge, the method adapts easily to various domains.
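The abstract does not give the paper's exact formulation, so the following is only a minimal sketch of the general idea: scoring adjacent token pairs in one document by pointwise mutual information (PMI) and keeping high-scoring, repeated pairs as candidate segments. The function names `adjacent_pmi` and `candidate_segments`, the whitespace tokenization, and the `threshold` and `min_count` parameters are all illustrative assumptions, not details taken from the paper.

```python
import math
from collections import Counter


def adjacent_pmi(tokens):
    """Return {(left, right): (PMI, count)} for adjacent token pairs in one document.

    PMI(a, b) = log2( P(a, b) / (P(a) * P(b)) ), with probabilities
    estimated from counts inside this single document (an assumption).
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)  # avoid division by zero on tiny inputs
    scores = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        scores[(a, b)] = (math.log2(p_ab / (p_a * p_b)), count)
    return scores


def candidate_segments(tokens, threshold=1.0, min_count=2):
    """Keep pairs that repeat at least min_count times and score >= threshold.

    Both cutoffs are hypothetical defaults chosen for illustration.
    """
    scored = adjacent_pmi(tokens)
    kept = [
        (pair, pmi)
        for pair, (pmi, count) in scored.items()
        if count >= min_count and pmi >= threshold
    ]
    # Highest-scoring candidates first.
    return sorted(kept, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    text = "new york is big and new york is old"
    for pair, pmi in candidate_segments(text.split()):
        print(" ".join(pair), round(pmi, 2))
```

A real system in the spirit of the abstract would still need the second stage it describes, examining each candidate's construction and context to decide its type and canonical form; this sketch covers only candidate collection.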
Year
2003
Venue
PACLIC
Field
Information retrieval, Computer science, Canonical form, Mutual information, Proper noun, Text processing
DocType
Conference
Citations
0
PageRank
0.34
References
1
Authors
1
Name
Order
Citations
PageRank
Zhili guo126412.46