Title |
---|
Comparing the sentence alignment yield from two news corpora using a dictionary-based alignment system |
Abstract |
---|
Corpus-based MT requires large sentence-aligned bilingual corpora, but these are hard to find for Japanese. Bilingual news corpora seem to offer a useful resource for machine translation, but their quality is variable. Sentence alignments produced by filtering literal word translations from the NHK corpus yield disappointing results, though correlating NP translations performs better. The same method gets even better results on the Nikkei corpus. This paper reports sentence alignment results from two corpora, using a 2-pass dictionary-based alignment system. |
Year | DOI | Venue |
---|---|---|
2003 | 10.3115/1118905.1118926 | ParallelTexts@NAACL-HLT |
Keywords | Field | DocType |
alignment system,bilingual corpus,better result,sentence alignment yield,nikkei corpus,dictionary-based alignment system,2-pass dictionary,nhk corpus,corpus-based mt,bilingual news corpus,large sentence,paper reports sentence alignment | Computer science,Machine translation,Filter (signal processing),Speech recognition,Natural language processing,Artificial intelligence,Sentence | Conference |
Citations | PageRank | References |
0 | 0.34 | 8 |
Authors |
---|
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Stephen Nightingale | 1 | 1 | 1.39 |
Hideki Tanaka | 2 | 80 | 15.07 |