Finding predominant word senses in untagged text - Citegraph

Paper Info

Title
Finding predominant word senses in untagged text

Abstract
In word sense disambiguation (WSD), the heuristic of choosing the most common sense is extremely powerful because the distribution of the senses of a word is often skewed. The problem with using the predominant, or first sense heuristic, aside from the fact that it does not take surrounding context into account, is that it assumes some quantity of hand-tagged data. Whilst there are a few hand-tagged corpora available for some languages, one would expect the frequency distribution of the senses of words, particularly topical words, to depend on the genre and domain of the text under consideration. We present work on the use of a thesaurus acquired from raw textual corpora and the WordNet similarity package to find predominant noun senses automatically. The acquired predominant senses give a precision of 64% on the nouns of the SENSEVAL-2 English all-words task. This is a very promising result given that our method does not require any hand-tagged text, such as SemCor. Furthermore, we demonstrate that our method discovers appropriate predominant senses for words from two domain-specific corpora.
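
The ranking idea in the abstract can be illustrated with a minimal sketch. It assumes that each distributional neighbour of the target word contributes its thesaurus similarity to the candidate senses, split in proportion to how similar each sense is to that neighbour. The function name, sense labels, neighbour words, and all similarity values below are hypothetical toy data, not taken from the paper. The lookup table merely stands in for a score that the WordNet similarity package would supply.

```python
from collections import defaultdict


def rank_senses(senses, neighbours, sense_sim):
    """Rank candidate senses of a target word (illustrative sketch only).

    senses:     list of sense identifiers for the target word
    neighbours: dict mapping neighbour word -> distributional similarity
                to the target word (from an automatically acquired thesaurus)
    sense_sim:  function (sense, neighbour) -> non-negative similarity,
                a stand-in for a WordNet-based similarity measure
    """
    scores = defaultdict(float)
    for neighbour, dist_sim in neighbours.items():
        # Similarity of every candidate sense to this neighbour.
        sims = {s: sense_sim(s, neighbour) for s in senses}
        total = sum(sims.values())
        if total == 0:
            continue  # this neighbour says nothing about any sense
        for s in senses:
            # The neighbour's weight is shared out across senses in
            # proportion to their similarity to it.
            scores[s] += dist_sim * sims[s] / total
    ranking = sorted(senses, key=lambda s: scores[s], reverse=True)
    return ranking, dict(scores)


# Hypothetical toy data: two senses of "pipe" and three thesaurus neighbours.
senses = ["pipe#tobacco", "pipe#tube"]
neighbours = {"tube": 0.20, "cable": 0.15, "cigar": 0.05}
toy_similarity = {
    ("pipe#tobacco", "tube"): 0.1, ("pipe#tube", "tube"): 0.9,
    ("pipe#tobacco", "cable"): 0.2, ("pipe#tube", "cable"): 0.6,
    ("pipe#tobacco", "cigar"): 0.8, ("pipe#tube", "cigar"): 0.2,
}

ranking, scores = rank_senses(
    senses, neighbours, lambda s, n: toy_similarity.get((s, n), 0.0)
)
print(ranking[0])  # the sense with the highest accumulated score
```

With these made-up numbers the neighbours drawn from a technical corpus favour the "tube" sense. Swapping in neighbours from a different domain would change the ranking, which is the domain sensitivity the abstract points to.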

Year	2004
DOI	10.3115/1218955.1218991
Venue	ACL
Keywords	topical word, word sense disambiguation, hand-tagged text, predominant noun sense, predominant word sense, predominant sense, untagged text, appropriate predominant sense, frequency distribution, sense heuristic, hand-tagged data, common sense, noun
Field	Heuristic, Common sense, Computer science, Noun, Artificial intelligence, Natural language processing, WordNet, Linguistics, Word-sense disambiguation, Aside
DocType	Conference
Volume	P04-1
Citations	163
PageRank	11.06
References	19
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Diana McCarthy	1	1020	73.34
Rob Koeling	2	434	38.38
Julie Weeds	3	541	34.97
John Carroll	4	1971	222.19
