Title
A Study On The Stability And Effectiveness Of Features In Quality Estimation For Spoken Language Translation
Abstract
A quality estimation (QE) approach informed with machine translation (MT) and speech recognition (ASR) features has recently been shown to improve the performance of a spoken language translation (SLT) system in an in-domain scenario. When domain mismatch is progressively introduced in the MT and ASR systems, the SLT system's performance naturally degrades. The use of QE to improve SLT performance has not been studied in this context. In this paper we investigate the effectiveness of QE under this setting. Our experiments showed that across moderate levels of domain mismatch, QE led to consistent translation improvements of around 0.4 in BLEU score. The QE system relies on 116 features derived from the ASR and MT system input and output. Feature analysis was conducted to understand the information sources contributing the most to performance improvements. LDA dimension reduction was used to summarise effective features into sets as small as 3 without affecting the SLT performance. By inspecting the principal components, eight features, including the acoustic model scores and count-based word statistics on the bilingual text, were found to be critically important, leading to a further boost of around 0.1 in BLEU score over the full set of features. These findings open up possibilities for further work on incorporating the effective QE features in SLT system training or decoding.
Year
2015
Venue
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5
Keywords
Spoken language translation, quality estimation, system robustness
Field
Spoken language translation, Dimensionality reduction, Computer science, Machine translation, Input/output, Speech recognition, Natural language processing, Artificial intelligence, Decoding methods, Principal component analysis, Pattern recognition (psychology), Acoustic model
DocType
Conference
Citations
0
PageRank
0.34
References
10
Authors
4
Name
Order
Citations
PageRank
Raymond W. M. Ng134021.61
Kashif Shah210311.69
Lucia Specia31217122.84
Thomas Hain4184.50