Title
Tree Based Shape Similarity Measurement for Chinese Characters.
Abstract
Many Chinese characters are similar in shape, and this similarity often leads to writing errors. Shape similarity measurement is an important part of automatic spelling correction, yet it remains a challenging problem. To address it, we propose a component-tree based method built on the hypothesis that "characters are similar if their construction and components are both similar". First, we recursively decompose each character into a tree in which the root node is the character and the leaf nodes are atomic parts, called strokes. Then, we align each pair of trees using their minimal super-tree and calculate their similarity bottom-up based on weighted edit distance. Finally, cognitive prominence is used to adjust the similarity scores. In text proofreading experiments, our method achieved 97% precision and 95.6% recall, indicating that it can be applied in practical systems.
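The bottom-up comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Node` class, the unit insert/delete costs, the normalization by the larger child count, and the placeholder labels (`s1`, `P`, `X`, ...) are all assumptions made for illustration. The sketch also omits the minimal super-tree alignment and the cognitive prominence adjustment, and it ignores the labels of internal nodes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A node in a hypothetical component tree; leaves stand for strokes."""
    label: str
    children: List["Node"] = field(default_factory=list)


def similarity(a: Node, b: Node) -> float:
    """Return a score in [0, 1]; 1.0 means the two subtrees match exactly."""
    if not a.children and not b.children:
        # Two leaves (strokes): assumed to match only when labels are equal.
        return 1.0 if a.label == b.label else 0.0
    if not a.children or not b.children:
        # A stroke compared with a compound component: treated as dissimilar.
        return 0.0
    n, m = len(a.children), len(b.children)
    # dist[i][j] is the weighted edit distance between the first i children
    # of a and the first j children of b. Insertion and deletion cost 1
    # (an assumption); substitution costs 1 minus the child similarity.
    dist = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = float(i)
    for j in range(1, m + 1):
        dist[0][j] = float(j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 1.0 - similarity(a.children[i - 1], b.children[j - 1])
            dist[i][j] = min(dist[i - 1][j] + 1.0,      # delete from a
                             dist[i][j - 1] + 1.0,      # insert from b
                             dist[i - 1][j - 1] + sub)  # substitute
    # The distance never exceeds max(n, m), so the score stays in [0, 1].
    return 1.0 - dist[n][m] / max(n, m)


# Two made-up characters that differ in a single stroke of one component.
x = Node("X", [Node("P", [Node("s1"), Node("s2")]), Node("s3")])
y = Node("Y", [Node("P", [Node("s1"), Node("s4")]), Node("s3")])
print(similarity(x, y))  # 0.75
```

Because the substitution cost at each level is derived from the similarity of the child subtrees, a difference deep in the tree (one stroke here) lowers the score of the enclosing character less than a difference near the root would.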
Year
2015
DOI
10.1007/978-3-319-25159-2_26
Venue
Lecture Notes in Artificial Intelligence
Keywords
Shape similarities, Chinese characters components, Cognitive similarity, Automatic text proofreading
Field
Edit distance, Chinese characters, Pattern recognition, Computer science, Spelling, Artificial intelligence, Phenomenon, Machine learning, Recursion
DocType
Conference
Volume
9403
ISSN
0302-9743
Citations
0
PageRank
0.34
References
3
Authors
3
Name          Order  Citations  PageRank
Ya-nan Cao    1      131        19.42
Shi Wang      2      28         12.46
Cungen Cao    3      309        58.63