Abstract | ||
---|---|---|
This paper presents a wide range of statistical word alignment experiments incorporating morphosyntactic information. By transforming the parallel corpus according to POS-tagging, lemmatization or stemming information, we explore which linguistic information helps reduce alignment error rates. To this end, alignments are evaluated against a human word alignment reference, with the aim of an improved machine translation training scheme that eventually leads to better SMT performance. Experiments are carried out on a Spanish–English European Parliament Proceedings parallel corpus, in both a large and a small data track. As expected, improvements from introducing morphosyntactic information are larger under data scarcity, but a significant improvement is also achieved in the large data task, meaning that certain linguistic knowledge is relevant even when large amounts of data are available. |
Year | DOI | Venue |
---|---|---|
2006 | 10.1007/11816508_38 | FinTAL |
Keywords | Field | DocType |
data scarcity,morphosyntactic information,morpho-syntactic transformation,large data availability,alignment error rate,improving statistical word alignment,human word alignment reference,parallel corpus,small data track,statistical word alignment experiment,large data task,linguistic information,machine translation | Rule-based machine translation,Lemmatisation,Small data,Computer science,Machine translation,Computational linguistics,Natural language,Natural language processing,Artificial intelligence,Parsing,Syntax | Conference |
Volume | ISSN | ISBN |
4139 | 0302-9743 | 3-540-37334-9 |
Citations | PageRank | References |
5 | 0.51 | 17 |
Authors | ||
8 |
Name | Order | Citations | PageRank |
---|---|---|---|
Adrià de Gispert | 1 | 472 | 35.22 |
Deepa Gupta | 2 | 75 | 11.91 |
Maja Popović | 3 | 169 | 13.09 |
Patrik Lambert | 4 | 277 | 23.36 |
José B. Mariño | 5 | 510 | 64.66 |
Marcello Federico | 6 | 2420 | 179.56 |
Hermann Ney | 7 | 14178 | 1506.93 |
Rafael Banchs | 8 | 120 | 8.91 |