Abstract | ||
---|---|---|
Semantically tagging a corpus is useful for many intermediate NLP tasks such as: acquisition of word argument structures in sublanguages, acquisition of syntactic disambiguation cues, terminology learning, etc. Semantic categories allow the generalization of observed word patterns, and facilitate the discovery of recurrent sublanguage phenomena and selectional rules of various types. Yet, as opposed to POS tags in morphology, there is no consensus in literature about the type and granularity of the category inventory. In addition, most available on-line taxonomies, as WordNet, are over ambiguous and, at the same time, may not include many domain-dependent senses of words. In this paper we describe a method to adapt a general purpose taxonomy to an application sublanguage: first, we prune branches of the WordNet hierarchy that are too "fine grained" for the domain; then, a statistical model of classes is built from corpus contexts to sort the different classifications or assign a classification to known and unknown words, respectively. |
Year | Venue | Keywords |
---|---|---|
1998 | WordNet@ACL/COLING | statistical model |
Field | DocType | Volume |
Pattern recognition,Terminology,Computer science,Information extraction,Natural language processing,Artificial intelligence,Corpus linguistics,Cluster analysis,WordNet,Syntax,Ambiguity,Sublanguage | Conference | W98-07 |
Citations | PageRank | References |
1 | 0.36 | 7 |
Authors | ||
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Roberto Basili | 1 | 1308 | 155.68 |
Alessandro Cucchiarelli | 2 | 226 | 36.38 |
Carlo Consoli | 3 | 2 | 0.75 |
Maria Teresa Pazienza | 4 | 704 | 144.36 |
Paola Velardi | 5 | 1553 | 163.66 |