Title
A Hybrid Model for Chinese Spelling Check.
Abstract
Spelling check is more challenging for Chinese than for other languages. This article presents a hybrid model for Chinese spelling check. The model consists of three components: one graph-based model for generic errors and two independently trained models for specific errors. In the graph model, a directed acyclic graph is generated for each sentence, and a single-source shortest-path algorithm is run on the graph to detect and correct general spelling errors at the same time. Prior to that, two types of errors over function words (characters) are handled by conditional random fields: confusion between “在” (at) (pinyin: zai) and “再” (again, more, then) (pinyin: zai), and confusion among “的” (of) (pinyin: de), “地” (-ly, adverb-forming particle) (pinyin: de), and “得” (so that, have to) (pinyin: de). Finally, a rule-based model is used to resolve pronoun confusion between “她” (she) (pinyin: ta) and “他” (he) (pinyin: ta), as well as some other common collocation errors. The proposed model is evaluated on the standard datasets released by the SIGHAN Bake-off shared tasks, giving state-of-the-art results.
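The graph component described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the lexicon, the costs, the confusion sets, the substitution penalty, and all function names below are hypothetical toy values chosen only to show how a shortest path over a sentence lattice can detect and correct an error in one pass. Nodes are character positions, and each edge is a candidate word (either the original text or a variant with one confusable character replaced). Because edges only run left to right, processing positions in order solves the single-source shortest-path problem on the DAG.

```python
import math

# Hypothetical toy lexicon: word -> cost (think of a negative log-probability).
LEXICON = {"我": 2.0, "在": 2.0, "再": 3.0, "家": 2.5, "在家": 3.0}
# Hypothetical confusion sets: character -> characters it is often mistaken for.
CONFUSION = {"再": ["在"], "在": ["再"]}
SUBSTITUTION_PENALTY = 1.5  # assumed extra cost for replacing a character
OOV_COST = 10.0             # assumed cost for an unknown single character
MAX_WORD_LEN = 4


def candidate_edges(segment):
    """Return (word, cost) candidates that could cover this segment."""
    edges = []
    if segment in LEXICON:
        edges.append((segment, LEXICON[segment]))
    # Try replacing one character at a time with a confusable character.
    for k, ch in enumerate(segment):
        for alt in CONFUSION.get(ch, []):
            cand = segment[:k] + alt + segment[k + 1:]
            if cand in LEXICON:
                edges.append((cand, LEXICON[cand] + SUBSTITUTION_PENALTY))
    # Fallback so that every position stays reachable.
    if not edges and len(segment) == 1:
        edges.append((segment, OOV_COST))
    return edges


def correct(sentence):
    """Shortest path from position 0 to len(sentence) over the word lattice."""
    n = len(sentence)
    best = [math.inf] * (n + 1)
    back = [None] * (n + 1)
    best[0] = 0.0
    # Positions are already in topological order, so one left-to-right pass
    # relaxes every edge exactly once.
    for i in range(n):
        if best[i] == math.inf:
            continue
        for j in range(i + 1, min(n, i + MAX_WORD_LEN) + 1):
            for word, cost in candidate_edges(sentence[i:j]):
                if best[i] + cost < best[j]:
                    best[j] = best[i] + cost
                    back[j] = (i, word)
    # Follow back-pointers to recover the chosen words.
    words = []
    j = n
    while j > 0:
        i, word = back[j]
        words.append(word)
        j = i
    return "".join(reversed(words))


print(correct("我再家"))
```

With these toy costs, the path through the corrected word “在家” is cheaper than keeping “再” as written, so the misspelled input comes out as “我在家”, while an already correct sentence is returned unchanged. A real system would derive edge costs from a language model rather than a hand-written table.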
Year
2017
DOI
10.1145/3047405
Venue
ACM Trans. Asian & Low-Resource Lang. Inf. Process.
Keywords
Chinese spelling check, hybrid model, graph model, conditional random field, rule-based model
Field
Conditional random field, Pronoun, Pinyin, Computer science, Speech recognition, Directed acyclic graph, Spelling, Artificial intelligence, Natural language processing, Sentence, Graph model, Collocation
DocType
Journal
Volume
16
Issue
3
ISSN
2375-4699
Citations
6
PageRank
0.44
References
44
Authors
5
Name | Order | Citations | PageRank
Hai Zhao | 1 | 960 | 113.64
Deng Cai | 2 | 67 | 5.96
Yang Xin | 3 | 29 | 7.03
Yuzhu Wang | 4 | 14 | 6.71
Zhongye Jia | 5 | 10 | 1.17