Arabic spelling error detection and correction - Citegraph

Paper Info

Title
Arabic spelling error detection and correction

Abstract
A spelling error detection and correction application is typically based on three main components: a dictionary (or reference word list), an error model and a language model. While most of the attention in the literature has been directed to the language model, we show how improvements in any of the three components can lead to significant cumulative improvements in the overall performance of the system. We develop our dictionary of 9.2 million fully-inflected Arabic words (types) from a morphological transducer and a large corpus, validated and manually revised. We improve the error model by analyzing error types and creating an edit distance re-ranker. We also improve the language model by analyzing the level of noise in different data sources and selecting an optimal subset to train the system on. Testing and evaluation experiments show that our system significantly outperforms Microsoft Word 2013, OpenOffice Ayaspell 3.4 and Google Docs.
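The three components named in the abstract (dictionary, error model, language model) can be illustrated with a minimal sketch. This is not the authors' system: the edit distance below is the standard optimal string alignment variant, a plain word count stands in for the language model, and `TOY_COUNTS`, `edit_distance`, and `suggest` are hypothetical names with made-up data chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical toy dictionary with made-up frequency counts (not from the paper).
TOY_COUNTS: Dict[str, int] = {"كتاب": 50, "كتب": 30, "كاتب": 20, "مكتب": 10}


def edit_distance(a: str, b: str) -> int:
    """Optimal string alignment distance: insertion, deletion,
    substitution, and transposition of adjacent characters each cost 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "ab" <-> "ba".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def suggest(word: str, word_counts: Dict[str, int], max_distance: int = 2) -> List[str]:
    """Return correction candidates for a word missing from the dictionary.

    Detection: a word found in the dictionary is treated as correct.
    Ranking: smaller edit distance first (error model stand-in),
    then higher count (language model stand-in).
    """
    if word in word_counts:
        return []
    scored = []
    for candidate, count in word_counts.items():
        dist = edit_distance(word, candidate)
        if dist <= max_distance:
            scored.append((dist, -count, candidate))
    scored.sort()
    return [candidate for _, _, candidate in scored]
```

With this toy data, calling `suggest("كتبا", TOY_COUNTS)` on a form with its last two letters transposed ranks "كتاب" first, because it is one transposition away and has the highest count among the distance-1 candidates. A real system along the lines the abstract describes would replace the count with a context-aware language model and the uniform edit costs with weights learned from observed error types.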

Year	2016
DOI	10.1017/S1351324915000030
Venue	Natural Language Engineering
Volume	22
Issue	5
ISSN	1351-3249
DocType	Journal
Field	Edit distance, Arabic, Computer science, Speech recognition, Error detection and correction, Artificial intelligence, Natural language processing, Spelling, Word processing, Language model
Citations	3
References	10
Authors	5
PageRank	0.40

Authors (5 rows)

Name	Order	Citations	PageRank
Mohammed Attia	1	146	16.51
Pavel Pecina	2	558	52.31
Samih Younes	3	38	11.26
Khaled F. Shaalan	4	506	39.80
Josef van Genabith	5	1037	105.64
