Measuring semantic similarity between words by removing noise and redundancy in web snippets - Citegraph

Paper Info

Title
Measuring semantic similarity between words by removing noise and redundancy in web snippets

Abstract
Semantic similarity measures play important roles in many Web-related tasks such as Web browsing and query suggestion. Because taxonomy-based methods cannot deal with continually emerging words, Web-based methods have recently been proposed to solve this problem. However, because of the noise and redundancy hidden in Web data, robustness and accuracy remain challenges. In this paper, we propose a method integrating page counts and snippets returned by Web search engines. Then, semantic snippets and the number of search results are used to remove noise and redundancy in the Web snippets (a ‘Web snippet’ includes the title, summary, and URL of a Web page returned by a search engine). After that, a method integrating page counts, semantic snippets, and the number of already displayed search results is proposed. The proposed method does not need any human-annotated knowledge (e.g., ontologies) and can easily be applied to Web-related tasks (e.g., query suggestion). A correlation coefficient of 0.851 against the Rubenstein–Goodenough benchmark dataset shows that the proposed method outperforms the existing Web-based methods by a wide margin. Moreover, the proposed semantic similarity measure significantly improves the quality of query suggestion compared with some page-count-based methods. Copyright © 2011 John Wiley & Sons, Ltd.
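The abstract does not give the paper's formulas, so the following is only a rough sketch of the general idea behind page-count-based similarity and its evaluation, not the authors' method. It assumes a WebJaccard-style co-occurrence score (a common page-count baseline) and assumes the reported correlation is a Pearson coefficient, which the abstract does not specify. The function names, the `min_cooccurrence` threshold, and all numbers in the usage lines are hypothetical placeholders, not data from the paper.

```python
from math import sqrt


def web_jaccard(count_p, count_q, count_pq, min_cooccurrence=5):
    """Jaccard-style similarity from search-engine page counts.

    count_p, count_q: page counts for each word queried alone.
    count_pq: page count for the conjunctive query "p AND q".
    min_cooccurrence: hypothetical cut-off that treats very small
    co-occurrence counts as noise and scores them as 0.
    """
    if count_pq <= min_cooccurrence:
        return 0.0
    return count_pq / (count_p + count_q - count_pq)


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up page counts for three word pairs (illustration only).
page_counts = [(1000, 800, 400), (1000, 900, 90), (500, 700, 3)]
predicted = [web_jaccard(p, q, pq) for p, q, pq in page_counts]

# Made-up human ratings standing in for a benchmark such as
# Rubenstein-Goodenough; real ratings would be loaded from the dataset.
human = [3.9, 1.2, 0.1]
correlation = pearson(predicted, human)
```

A score of this kind uses page counts alone; the method described in the abstract additionally draws on snippets and the number of displayed search results, which this sketch does not attempt to model.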

Field	Value
Year	2011
DOI	10.1002/cpe.1816
Venue	Concurrency and Computation: Practice and Experience
Keywords	semantic similarity,web snippet,Web search engine,Web data,page count,Web snippet,Web page,Web-related task,query suggestion,Web browsing,search result,proposed method
DocType	Journal
Volume	23
Issue	18
ISSN	1532-0626
Citations	20
PageRank	1.00
References	26
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Zheng Xu	1	352	19.51
Xiangfeng Luo	2	1251	124.38
Jie Yu	3	20	1.00
Weimin Xu	4	61	7.98
