Improved Estimation of Entropy for Evaluation of Word Sense Induction. - Citegraph

Paper Info

Title
Improved Estimation of Entropy for Evaluation of Word Sense Induction.

Abstract
Information-theoretic measures are among the most standard techniques for evaluation of clustering methods, including word sense induction (WSI) systems. Such measures rely on sample-based estimates of the entropy. However, the standard maximum likelihood estimates of the entropy are heavily biased, with the bias dependent on, among other things, the number of clusters and the sample size. This makes the measures unreliable and unfair when the number of clusters produced by different systems varies and the sample size is not exceedingly large. This corresponds exactly to the setting of WSI evaluation, where a ground-truth number of sense clusters arguably does not exist and the standard evaluation scenarios use a small number of instances of each word to compute the score. We describe more accurate entropy estimators and analyze their performance both in simulations and on evaluation of WSI systems.
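
The bias described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes a uniform distribution over clusters, and it uses the classical Miller-Madow correction only as one well-known example of a bias-reduced estimator, since the abstract does not name the estimators the authors evaluate. The function names (plugin_entropy, miller_madow_entropy, mean_estimates) are hypothetical.

```python
import math
import random
from collections import Counter


def plugin_entropy(labels):
    """Maximum likelihood (plug-in) entropy estimate, in nats."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def miller_madow_entropy(labels):
    """Plug-in estimate plus the Miller-Madow term (K_observed - 1) / (2n).

    Used here only as a familiar bias-reduced estimator; it is an
    assumption of this sketch, not necessarily the paper's choice.
    """
    n = len(labels)
    k_observed = len(set(labels))
    return plugin_entropy(labels) + (k_observed - 1) / (2 * n)


def mean_estimates(num_clusters, sample_size, trials=2000, seed=0):
    """Average both estimators over repeated samples from a uniform
    distribution over `num_clusters` clusters (an assumed toy setting).

    Returns (mean plug-in, mean Miller-Madow, true entropy).
    """
    rng = random.Random(seed)
    plug_total = 0.0
    mm_total = 0.0
    for _ in range(trials):
        sample = [rng.randrange(num_clusters) for _ in range(sample_size)]
        plug_total += plugin_entropy(sample)
        mm_total += miller_madow_entropy(sample)
    return plug_total / trials, mm_total / trials, math.log(num_clusters)


if __name__ == "__main__":
    # Same sample size, different numbers of clusters.
    for k in (2, 10):
        plug, mm, true_h = mean_estimates(num_clusters=k, sample_size=30)
        print(f"K={k}: true={true_h:.3f} plug-in={plug:.3f} Miller-Madow={mm:.3f}")
```

In this toy setting the plug-in estimate falls below the true entropy on average, and the shortfall grows with the number of clusters at a fixed sample size, which is the dependence the abstract points to.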

Year	2014
DOI	10.1162/COLI_a_00196
Venue	Computational Linguistics
Field	Small number, Cluster (physics), Word-sense induction, Computer science, Maximum likelihood, Artificial intelligence, Cluster analysis, Machine learning, Sample size determination, Estimator
DocType	Journal
Volume	40
Issue	3
ISSN	0891-2017
Citations	0
PageRank	0.34
References	14
Authors	3

Authors (3 rows)

Name	Order	Citations	PageRank
Linlin Li	1	117	7.66
Ivan Titov	2	1484	81.98
Caroline Sporleder	3	453	31.84
