Paper Info

Title
Using Mutual Information to Identify New Features for Text documents of Various Domains

Abstract
Identifying proper names, unknown words, and new terms is an important step in text processing systems. This paper describes a method that uses mutual information to collect possible segments, within the scope of a document, as candidates for these three feature types. The construction and context of each candidate feature are then examined to determine its type, canonical form, and meaning. With very little added domain-specific knowledge, the method adapts easily to various domains.
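
As a rough illustration of the mutual-information step mentioned in the abstract, the sketch below scores adjacent token pairs by pointwise mutual information (PMI) within a single document and keeps the high-scoring pairs as candidate segments. The abstract does not state the exact formula, thresholds, or segmentation procedure, so the function name `pmi_candidates`, the `min_count` floor, and the `threshold` value are assumptions made for illustration and should not be read as the paper's actual method.

```python
import math
from collections import Counter


def pmi_candidates(tokens, min_count=2, threshold=1.0):
    """Return adjacent token pairs whose PMI (in bits) meets the threshold.

    Hypothetical helper: the parameters and the scoring are assumptions,
    not details taken from the paper.
    """
    if len(tokens) < 2:
        return {}
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)       # number of token positions
    n_bi = len(tokens) - 1    # number of adjacent pairs
    scores = {}
    for (a, b), count in bigrams.items():
        if count < min_count:
            continue  # skip pairs too rare to score reliably
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        pmi = math.log2(p_ab / (p_a * p_b))
        if pmi >= threshold:
            scores[(a, b)] = pmi
    return scores


# Toy document: "new york" recurs, so it should surface as a candidate.
doc = "new york is big and new york is old but york alone is new".split()
candidates = pmi_candidates(doc)
```

In a fuller pipeline, each surviving pair would then be examined in context to decide whether it is a proper name, an unknown word, or a new term; that classification step is not sketched here.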

Year	Venue	Field
2003	PACLIC	Information retrieval, Computer science, Canonical form, Mutual information, Proper noun, Text processing
DocType	Citations	PageRank
Conference	0	0.34
References	Authors
1	1

Authors (1 rows)

Name	Order	Citations	PageRank
Zhili Guo	1	264	12.46
