Abstract |
---|
This paper addresses the problem of real-word spell checking, i.e., the detection and correction of typos that result in real words of the target language. It proposes a methodology based on a mixed trigrams language model. The model has been implemented, trained, and tested with data from the Penn Treebank, and the approach has been evaluated in terms of hit rate, false positive rate, and coverage. The experiments show promising hit rates for both detection and correction, although the false positive rate is still high. |
Year | DOI | Venue |
---|---|---|
2007 | 10.1007/978-3-540-70939-8_55 | CICLing '07 Proceedings of the 8th International Conference on Computational Linguistics and Intelligent Text Processing |
Keywords | Field | DocType |
hit rate, mixed trigrams approach, mixed trigrams language model, penn treebank, real-word spell checking, target language, real word, false positive rate, context sensitive spell checking, language model | Hit rate, False positive rate, Computer science, Trigram, Speech recognition, Treebank, Artificial intelligence, Natural language processing, Spell, Language model | Conference |
Volume | ISSN | Citations |
4394 | 0302-9743 | 15 |
PageRank | References | Authors |
0.83 | 13 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Davide Fossati | 1 | 170 | 19.62 |
Barbara Di Eugenio | 2 | 35 | 2.71 |