Title
Parsing a natural language using mutual information statistics
Abstract
The purpose of this paper is to characterize a constituent boundary parsing algorithm, using an information-theoretic measure called generalized mutual information, which serves as an alternative to traditional grammar-based parsing methods. This method is based on the hypothesis that constituent boundaries can be extracted from a given sentence (or word sequence) by analyzing the mutual information values of the part-of-speech n-grams within the sentence. This hypothesis is supported by the performance of an implementation of this parsing algorithm, which determines a recursive unlabeled bracketing of unrestricted English text with a relatively low error rate. This paper derives the generalized mutual information statistic, describes the parsing algorithm, and presents results and sample output from the parser.
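The hypothesis in the abstract can be illustrated with a small sketch. The code below is not the authors' algorithm: it replaces the generalized mutual information statistic with ordinary pointwise mutual information over part-of-speech bigrams, and it brackets a sentence by greedily splitting at the adjacent tag pair with the lowest score. The function names `bigram_pmi` and `bracket`, the tag names, and the toy corpus are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def bigram_pmi(tag_sequences):
    """Estimate pointwise mutual information for adjacent POS tags.

    Assumption: plain bigram PMI with maximum-likelihood counts, a
    simplification of the paper's generalized mutual information.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tags in tag_sequences:
        unigrams.update(tags)
        bigrams.update(zip(tags, tags[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    def pmi(x, y):
        joint = bigrams[(x, y)]
        if joint == 0:
            # Unseen pair: treat it as the strongest possible boundary.
            return float("-inf")
        p_xy = joint / n_bi
        p_x = unigrams[x] / n_uni
        p_y = unigrams[y] / n_uni
        return math.log2(p_xy / (p_x * p_y))

    return pmi


def bracket(tags, pmi):
    """Return an unlabeled recursive bracketing as nested lists.

    Assumption: a greedy top-down split at the lowest-PMI adjacent pair.
    """
    if len(tags) == 1:
        return tags[0]
    if len(tags) == 2:
        return list(tags)
    scores = [pmi(tags[i], tags[i + 1]) for i in range(len(tags) - 1)]
    cut = scores.index(min(scores)) + 1
    return [bracket(tags[:cut], pmi), bracket(tags[cut:], pmi)]


# Toy training data (hypothetical): two tiny tag sequences.
toy_corpus = [["DT", "NN"], ["VB", "RB"]]
toy_pmi = bigram_pmi(toy_corpus)
print(bracket(["DT", "NN", "VB", "RB"], toy_pmi))
# prints [['DT', 'NN'], ['VB', 'RB']]
```

On this toy data the pair (NN, VB) never occurs, so it receives the lowest score and becomes the top-level boundary. A realistic setup would need a large tagged corpus, smoothing for unseen pairs, and the wider n-gram context that the generalized statistic is designed to capture.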
Year
1990
Venue
AAAI
Keywords
recursive unlabeled bracketing, low error rate, sample output, constituent boundary, mutual information, parsing algorithm, mutual information value, natural language, traditional grammar-based parsing method, information-theoretic measure, generalized mutual information statistic
Field
Top-down parsing language, Top-down parsing, S-attributed grammar, Computer science, Speech recognition, Bottom-up parsing, Parsing expression grammar, Artificial intelligence, Natural language processing, Parsing, Parser combinator, Pointwise mutual information
DocType
Conference
ISBN
0-262-51057-X
Citations
52
PageRank
97.13
References
3
Authors
2
Name
Order
Citations
PageRank
David M. Magerman1726512.15
Mitchell P. Marcus23098854.76