Title
A Framework for Collocation Error Correction in Web Pages and Text Documents
Abstract
Much of the English in text documents today comes from non-native speakers, who also conduct a large share of Web searches. Though highly qualified in their respective fields, these speakers may make collocation errors, e.g., "dark money" and "stock agora" instead of the more appropriate English expressions "black money" and "stock market", respectively. Such errors may arise from literal translation from the speaker's native language or from other factors, and they can cause problems in contexts such as querying over Web pages and correctly understanding text documents. This paper proposes a framework called CollOrder to detect such collocation errors and suggest correctly ordered collocated responses that improve the semantics. The framework integrates machine learning approaches with natural language processing techniques and proposes suitable heuristics to provide responses to collocation errors, ranked in order of correctness. We discuss the proposed framework, its algorithms, and its experimental evaluation. We claim that it would be useful for semantically enhancing Web querying in areas such as financial news and online shopping. It would also help in providing automated error correction for machine-translated documents and in assisting people who use ESL tools.
Year: 2015
DOI: 10.1145/2830544.2830548
Venue: SIGKDD Explorations
Field: Data mining, Ranking, Expression (mathematics), Web page, Computer science, Correctness, Heuristics, Literal translation, Natural language processing, Artificial intelligence, Semantics, Collocation
DocType: Journal
Volume: 17
Issue: 1
Citations: 1
PageRank: 0.36
References: 12
Authors: 4
Name                Order  Citations  PageRank
Alan Varghese       1      1          0.36
Aparna S. Varde     2      188        28.71
Jing Peng           3      145        16.57
Eileen Fitzpatrick  4      3          0.73