Abstract | ||
---|---|---|
The lack of computational support has significantly slowed down automatic understanding of endangered languages. In this paper, we take Nyushu (simplified Chinese: 女书; literally: “women’s writing”) as a case study to present the first computational approach that combines Computer Vision and Natural Language Processing techniques to deeply understand an endangered language. We developed an end-to-end system that reads a scanned handwritten Nyushu article, segments it into characters, links them to standard characters, and then translates the article into Mandarin Chinese. We propose several novel methods to address the new challenges introduced by noisy input and low resources, including Nyushu-specific feature selection for character segmentation and linking, and machine translation based on a character-linking lattice. The end-to-end performance indicates that the approach is promising and that the system can serve as a standard benchmark. |
Year | DOI | Venue |
---|---|---|
2016 | 10.1145/2857052 | ACM Trans. Asian & Low-Resource Lang. Inf. Process. |
Keywords | Field | DocType |
Endangered languages, nyushu, recognition, translation | Endangered species, Feature selection, Computer science, Segmentation, Machine translation, Endangered language, Speech recognition, Natural language processing, Artificial intelligence, Mandarin Chinese | Journal |
Volume | Issue | ISSN |
15 | 4 | 2375-4699 |
Citations | PageRank | References |
1 | 0.37 | 18 |
Authors | ||
---|---|---|
8 |
Name | Order | Citations | PageRank |
---|---|---|---|
Tongtao Zhang | 1 | 1 | 0.37 |
Aritra Chowdhury | 2 | 2 | 1.05 |
Nimit Dhulekar | 3 | 14 | 3.08 |
Jinjing Xia | 4 | 1 | 0.37 |
Kevin Knight | 5 | 5096 | 462.44 |
Heng Ji | 6 | 1544 | 127.27 |
Bülent Yener | 7 | 1075 | 94.51 |
Liming Zhao | 8 | 1 | 0.37 |