Abstract | ||
---|---|---|
We describe a generative probabilistic model of natural language, which we call HBG, that takes advantage of detailed linguistic information to resolve ambiguity. HBG incorporates lexical, syntactic, semantic, and structural information from the parse tree into the disambiguation process in a novel way. We use a corpus of bracketed sentences, called a Treebank, in combination with decision tree building to tease out the relevant aspects of a parse tree that will determine the correct parse of a sentence. This stands in contrast to the usual approach of further grammar tailoring via the usual linguistic introspection in the hope of generating the correct parse. In head-to-head tests against one of the best existing robust probabilistic parsing models, which we call P-CFG, the HBG model significantly outperforms P-CFG, increasing the parsing accuracy rate from 60% to 75%, a 37% reduction in error. |
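
The error-reduction figure in the abstract can be checked from the two accuracy rates. This is a minimal worked calculation, assuming the error rate is simply one minus the parsing accuracy rate (the abstract does not spell out the definition):

```latex
\frac{(1 - 0.60) - (1 - 0.75)}{1 - 0.60}
  = \frac{0.40 - 0.25}{0.40}
  = 0.375
```

Under that assumption the relative reduction is 37.5%, consistent with the 37% quoted in the abstract.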
Year | DOI | Venue |
---|---|---|
1993 | 10.3115/981574.981579 | Meeting of the Association for Computational Linguistics |
Keywords | DocType | Volume |
usual approach,richer model,detailed linguistic information,decision tree building,towards history-based grammar,structural information,generative probabilistic model,correct parse,parsing accuracy rate,existing robust probabilistic,hbg model,parse tree,probabilistic parsing,decision tree,natural language,probabilistic model | Conference | abs/cmp-lg/9405007 |
ISSN | ISBN | Citations |
Proceedings, DARPA Speech and Natural Language Workshop, 1992 | 1-55860-272-0 | 103 |
PageRank | References | Authors |
64.03 | 9 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ezra Black | 1 | 338 | 226.66 |
Frederick Jelinek | 2 | 125 | 74.24 |
John D. Lafferty | 3 | 14904 | 1772.53 |
David M. Magerman | 4 | 726 | 512.15 |
Robert Mercer | 5 | 131 | 74.05 |
Salim Roukos | 6 | 6248 | 845.50 |