Abstract | ||
---|---|---|
A verb paradigm is a set of inflectional categories for a single verb lemma. To obtain verb paradigms we extracted left and right bigrams for the 400 most frequent verbs from over 100 million words of text, calculated the Kullback-Leibler distance for each pair of verbs for left and right contexts separately, and ran a hierarchical clustering algorithm for each context. Our new method for finding unsupervised cut points in the cluster trees produced results that compared favorably with results obtained using supervised methods, such as gain ratio, a revised gain ratio and number of correctly classified items. Left context clusters correspond to inflectional categories, and right context clusters correspond to verb lemmas. For our test data, 91.5% of the verbs are correctly classified for inflectional category, 74.7% are correctly classified for lemma, and the correct joint classification for lemma and inflectional category was obtained for 67.5% of the verbs. These results are derived only from distributional information without use of morphological information. |
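
The pipeline summarized in the abstract (context bigrams, pairwise Kullback-Leibler distance, hierarchical clustering) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy corpus, the additive smoothing constant `alpha`, the symmetrized form of the distance, the average-linkage criterion, and all function names are assumptions made for illustration, and the paper's cut-point method is not reproduced.

```python
import math
from collections import Counter

def left_context_counts(tokens, verbs):
    """Count the word immediately to the left of each target verb."""
    counts = {v: Counter() for v in verbs}
    for prev, word in zip(tokens, tokens[1:]):
        if word in counts:
            counts[word][prev] += 1
    return counts

def smoothed_dist(counter, vocab, alpha=0.5):
    """Additively smoothed distribution over vocab (alpha is an assumption)."""
    total = sum(counter.values()) + alpha * len(vocab)
    return {w: (counter[w] + alpha) / total for w in vocab}

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) over a shared vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)

def sym_kl(p, q):
    """Symmetrized divergence (an assumption; the abstract does not say)."""
    return kl(p, q) + kl(q, p)

def average_linkage(items, dist):
    """Naive agglomerative clustering; returns merges in the order made."""
    clusters = [(item,) for item in items]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                d = sum(dist(a, b) for a, b in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

# Toy corpus (hypothetical); the paper used over 100 million words.
tokens = ("she has eaten . he has taken . they will eat . we will take . "
          "she has taken . they will eat").split()
verbs = ["eat", "eaten", "take", "taken"]

counts = left_context_counts(tokens, verbs)
vocab = sorted({w for c in counts.values() for w in c})
dists = {v: smoothed_dist(counts[v], vocab) for v in verbs}
merges = average_linkage(verbs, lambda a, b: sym_kl(dists[a], dists[b]))

for left, right, d in merges:
    print(left, right, round(d, 3))
```

On this toy data the first two merges pair verbs that share a left context ("has" or "will"), which matches the abstract's observation that left-context clusters track inflectional category. The same functions could be applied to right contexts by counting the word that follows each verb instead.
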
Year | Venue | Field |
---|---|---|
1998 | VLC@COLING/ACL | Verb, Computer science, Artificial intelligence, Natural language processing |
DocType | Citations | PageRank |
Conference | 0 | 0.34 |
References | Authors | |
3 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Cornelia H. Parkes | 1 | 0 | 0.34 |
Alexander M. Malek | 2 | 0 | 0.34 |
Mitchell P. Marcus | 3 | 3098 | 854.76 |