Abstract | ||
---|---|---|
This paper proposes a novel approach to spelling correction: it reranks the output of an existing spelling corrector, Aspell. A discriminative model (Ranking SVM) is employed to improve upon the initial ranking, using additional features as evidence. These features are derived from state-of-the-art techniques in spelling correction, including edit distance, letter-based n-grams, phonetic similarity and the noisy channel model. This paper also presents a new method to automatically extract training samples from the query log chain. The system greatly outperforms the baseline Aspell, as well as previous models and several off-the-shelf spelling correction systems (e.g. Microsoft Word 2003). The results on query chain pairs are comparable to those based on manually-annotated pairs, with 32.2%/32.6% reduction in error rate, respectively. |
Year | Venue | Field |
---|---|---|
2006 | PACLIC 20: PROCEEDINGS OF THE 20TH PACIFIC ASIA CONFERENCE ON LANGUAGE, INFORMATION AND COMPUTATION | Edit distance, Ranking SVM, Ranking, Computer science, Word error rate, Speech recognition, Artificial intelligence, Natural language processing, Spelling, Noisy channel model, Discriminative model, Word processing |
DocType | Citations | PageRank |
Conference | 5 | 0.55 |
References | Authors | |
13 | 5 |
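
The reranking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the candidate list merely stands in for Aspell's output, only two of the named feature families (edit distance and letter-based n-grams) are computed, and the `WEIGHTS` vector is a hypothetical hand-picked stand-in for the weights a Ranking SVM would learn from training pairs. All function names are illustrative assumptions.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,         # deletion
                            curr[j - 1] + 1,     # insertion
                            prev[j - 1] + cost)) # substitution or match
        prev = curr
    return prev[-1]


def letter_bigram_overlap(a: str, b: str) -> float:
    """Jaccard overlap between the letter-bigram sets of two words."""
    def bigrams(s: str) -> set:
        padded = f"^{s}$"  # boundary markers so word edges count too
        return {padded[i:i + 2] for i in range(len(padded) - 1)}
    x, y = bigrams(a), bigrams(b)
    return len(x & y) / len(x | y)


def features(query: str, candidate: str, rank: int) -> List[float]:
    """Feature vector: edit distance, bigram overlap, original rank."""
    return [float(edit_distance(query, candidate)),
            letter_bigram_overlap(query, candidate),
            float(rank)]


# Hypothetical weights (assumption): a Ranking SVM would learn these
# from preference pairs rather than have them set by hand.
WEIGHTS = [-1.0, 2.0, -0.1]


def rerank(query: str, candidates: List[str]) -> List[str]:
    """Reorder an initial candidate list by a linear score over features."""
    def score(item) -> float:
        rank, cand = item
        return sum(w * f for w, f in zip(WEIGHTS, features(query, cand, rank)))
    ranked = sorted(enumerate(candidates), key=score, reverse=True)
    return [cand for _, cand in ranked]
```

Calling `rerank("speling", candidates)` with a candidate list in its initial order returns the same candidates sorted by the linear score. Keeping the original rank as a feature lets the reranker fall back on the baseline ordering when the other evidence is weak.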