A Framework for Collocation Error Correction in Web Pages and Text Documents - Citegraph

Paper Info

Title
A Framework for Collocation Error Correction in Web Pages and Text Documents

Abstract
Much of the English in text documents today comes from non-native speakers. Web searches are also conducted very often by non-native speakers. Though highly qualified in their respective fields, these speakers could potentially make errors in collocation, e.g., "dark money" and "stock agora" (instead of the more appropriate English expressions "black money" and "stock market", respectively). These may arise due to literal translation from the respective speaker's native language or other factors. Such errors could cause problems in contexts such as querying over Web pages, correct understanding of text documents and more. This paper proposes a framework called CollOrder to detect such collocation errors and suggest correctly ordered collocated responses for improving the semantics. This framework integrates machine learning approaches with natural language processing techniques, proposing suitable heuristics to provide responses to collocation errors, ranked in the order of correctness. We discuss the proposed framework with algorithms and experimental evaluation in this paper. We claim that it would be useful in semantically enhancing Web querying, e.g., financial news, online shopping, etc. It would also help in providing automated error correction in machine-translated documents and offering assistance to people using ESL tools.
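
The detect-then-rank idea in the abstract can be illustrated with a small sketch. This is not the CollOrder algorithm, which the abstract does not specify. It is a toy lookup that flags a two-word phrase missing from a reference table and ranks known collocations sharing one word with it. The table `KNOWN`, its counts, and the function names are all hypothetical, chosen only so that the abstract's two examples can be run.

```python
# Toy reference table of known collocations.
# The counts are hypothetical and exist only for illustration.
KNOWN = {
    ("black", "money"): 120,
    ("easy", "money"): 80,
    ("stock", "market"): 300,
    ("stock", "exchange"): 150,
}


def is_collocation_error(pair):
    """Flag a two-word phrase that is absent from the reference table."""
    return pair not in KNOWN


def suggest(pair):
    """Return known collocations that share the first or second word
    with `pair`, ordered from most to least frequent."""
    first, second = pair
    candidates = [
        known for known in KNOWN
        if known[0] == first or known[1] == second
    ]
    return sorted(candidates, key=lambda c: KNOWN[c], reverse=True)


if __name__ == "__main__":
    for phrase in [("dark", "money"), ("stock", "agora")]:
        if is_collocation_error(phrase):
            print(phrase, "->", suggest(phrase))
```

A real system would presumably draw its reference table from a large corpus and use a richer similarity measure than exact word overlap. The sketch only shows the shape of the task: detection followed by a ranked list of suggestions.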

Year	2015
DOI	10.1145/2830544.2830548
Venue	SIGKDD Explorations
Field	Data mining, Ranking, Expression (mathematics), Web page, Computer science, Correctness, Heuristics, Literal translation, Natural language processing, Artificial intelligence, Semantics, Collocation
DocType	Journal
Volume	17
Issue	1
Citations	1
PageRank	0.36
References	12
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Alan Varghese	1	1	0.36
Aparna S. Varde	2	188	28.71
Jing Peng	3	145	16.57
Eileen Fitzpatrick	4	3	0.73
