Title
Utilization of Multi-word Expressions to Improve Statistical Machine Translation of Statutory Sentences
Abstract
Statutory sentences are generally difficult to read because of their length and complicated expressions. This difficulty is one reason for the low quality of statistical machine translation (SMT) of such sentences. Multi-word expressions (MWEs) also complicate statutory sentences and make them longer. We therefore propose a method that uses MWEs to improve SMT of statutory sentences. In our method, we extract monolingual MWEs from a parallel corpus, automatically acquire their translation equivalents based on the Dice coefficient, and integrate the resulting bilingual MWEs into an SMT system using the single-tokenization strategy. In our experiments, the SMT system using the proposed method significantly improved translation quality. Although automatic acquisition of translation equivalents using the Dice coefficient is not perfect, the best system's score was close to that of a system that used bilingual MWEs whose equivalents were translated by hand.
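The two mechanical steps named in the abstract, scoring candidate translation equivalents with the Dice coefficient and single-tokenizing MWEs, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the standard definition Dice(s, t) = 2·f(s, t) / (f(s) + f(t)), with frequencies counted as the number of sentence pairs in which an expression occurs; the abstract does not state the counting unit. The function names, the plain substring matching, the underscore joiner, and the English-French toy data are all hypothetical and do not come from the paper.

```python
from collections import Counter
from itertools import product


def dice_scores(sentence_pairs, source_mwes, target_candidates):
    """Score (source MWE, target candidate) pairs by sentence-level co-occurrence.

    Assumption: f(x) is the number of sentence pairs whose relevant side
    contains x as a substring, which is a simplification.
    """
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in sentence_pairs:
        src_hits = {m for m in source_mwes if m in src}
        tgt_hits = {c for c in target_candidates if c in tgt}
        src_count.update(src_hits)
        tgt_count.update(tgt_hits)
        joint.update(product(src_hits, tgt_hits))
    # Standard Dice coefficient: 2 * f(s, t) / (f(s) + f(t)).
    return {
        (s, t): 2 * n / (src_count[s] + tgt_count[t])
        for (s, t), n in joint.items()
    }


def best_equivalents(scores):
    """Keep the highest-scoring target candidate for each source MWE."""
    best = {}
    for (s, t), score in scores.items():
        if s not in best or score > best[s][1]:
            best[s] = (t, score)
    return best


def single_tokenize(sentence, mwes, joiner="_"):
    """Rewrite each MWE as one token by joining its words (hypothetical joiner)."""
    # Longer MWEs first, so a shorter MWE does not split a longer one.
    for mwe in sorted(mwes, key=len, reverse=True):
        sentence = sentence.replace(mwe, mwe.replace(" ", joiner))
    return sentence


# Toy parallel data, invented for illustration only.
pairs = [
    ("payment in accordance with the act", "paiement conformément à la loi"),
    ("in accordance with article 3", "conformément à l'article 3"),
    ("pursuant to article 5", "en vertu de l'article 5"),
    ("fees pursuant to the act", "frais en vertu de la loi"),
    ("in accordance with the act and pursuant to article 2",
     "conformément à la loi et en vertu de l'article 2"),
]
source_mwes = ["in accordance with", "pursuant to"]
target_candidates = ["conformément à", "en vertu de"]

scores = dice_scores(pairs, source_mwes, target_candidates)
for mwe, (equivalent, score) in best_equivalents(scores).items():
    print(mwe, "->", equivalent, round(score, 3))
print(single_tokenize(pairs[0][0], source_mwes))
```

In a real pipeline the same single-tokenization would presumably be applied to both sides of the training corpus before word alignment and undone after decoding; that surrounding workflow is an assumption and is not described in the abstract.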
Year
2015
DOI
10.1007/978-3-319-50953-2_18
Venue
Lecture Notes in Artificial Intelligence
Keywords
Multi-word expressions,Statistical machine translation,Legal information sharing
Field
Rule-based machine translation,Expression (mathematics),Statutory law,Sørensen–Dice coefficient,Computer science,Machine translation,Speech recognition,Natural language processing,Artificial intelligence,Automatic translation
DocType
Conference
Volume
10091
ISSN
0302-9743
Citations
0
PageRank
0.34
References
0
Authors
5
Name, Order, Citations, PageRank
Satomi Sakamoto, 1, 0, 0.34
Yasuhiro Ogawa, 2, 1, 3.08
Makoto Nakamura, 3, 28, 7.99
Tomohiro Ohno, 4, 31, 10.06
Katsuhiko Toyama, 5, 39, 11.41