Title
STD: An Automatic Evaluation Metric for Machine Translation Based on Word Embeddings
Abstract
Lexical metrics such as BLEU, NIST, and WER are widely used in machine translation (MT) evaluation. However, these metrics capture semantic relationships poorly and require exact string matches, which limits them to moderate correlation with human judgments. In this paper, we propose Semantic Travel Distance (STD), a novel automatic MT evaluation metric based on word embeddings. STD combines semantic and lexical features (word embeddings, n-grams, and word order) in a single metric. It measures the semantic distance between the hypothesis and the reference as the minimum cumulative cost that the embedded n-grams of the hypothesis must “travel” to reach the embedded n-grams of the reference. Experimental results show that STD performs better and more robustly than a range of state-of-the-art metrics in both segment-level and system-level evaluation.
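The "travel" idea in the abstract can be illustrated with a small sketch. This is not the authors' STD implementation: the abstract does not give the cost function, the n-gram embedding scheme, or the solver, so the code below swaps in a relaxed nearest-neighbour variant in which each hypothesis n-gram travels to its closest reference n-gram. The toy vectors in `TOY_VECTORS`, the averaging used in `embed_ngram`, the Euclidean cost, and the function name `relaxed_travel_distance` are all assumptions made for illustration.

```python
import math

# Toy 2-d word vectors, invented for illustration only (not trained embeddings).
TOY_VECTORS = {
    "the": (0.0, 0.1),
    "cat": (1.0, 0.0),
    "kitten": (0.9, 0.1),
    "sat": (0.0, 1.0),
    "rested": (0.1, 0.9),
}


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def embed_ngram(ngram, vectors):
    """Embed an n-gram as the mean of its word vectors (an assumption)."""
    dims = len(vectors[ngram[0]])
    return tuple(
        sum(vectors[word][d] for word in ngram) / len(ngram)
        for d in range(dims)
    )


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def relaxed_travel_distance(hypothesis, reference, vectors, n=1):
    """Average cost for each hypothesis n-gram to reach its nearest
    reference n-gram. This is a relaxation of the full minimum-cost
    transport problem, not the exact minimum cumulative cost."""
    hyp = [embed_ngram(g, vectors) for g in ngrams(hypothesis, n)]
    ref = [embed_ngram(g, vectors) for g in ngrams(reference, n)]
    if not hyp or not ref:
        raise ValueError("both sentences need at least one n-gram")
    total = sum(min(euclidean(h, r) for r in ref) for h in hyp)
    return total / len(hyp)


if __name__ == "__main__":
    reference = "the cat sat".split()
    paraphrase = "the kitten rested".split()
    reordered = "the sat cat".split()
    print(relaxed_travel_distance(paraphrase, reference, TOY_VECTORS))
    print(relaxed_travel_distance(reordered, reference, TOY_VECTORS, n=1))
    print(relaxed_travel_distance(reordered, reference, TOY_VECTORS, n=2))
```

In this sketch an identical hypothesis and reference give a distance of 0.0, and a paraphrase with nearby vectors gives a small nonzero distance instead of the outright mismatch that exact string matching would report. With unigrams a reordered sentence is indistinguishable from the reference, while with bigrams it is not, which hints at why n-grams and word order are combined with the embeddings.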
Year: 2019
DOI: 10.1109/TASLP.2019.2922845
Venue: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Keywords: Measurement, Semantics, Syntactics, NIST, Speech processing, Earth, Linguistics
Field: Semantic similarity, BLEU, Word order, Computer science, Machine translation, Speech recognition, NIST, Correlation
DocType: Journal
Volume: 27
Issue: 10
ISSN: 2329-9290
Citations: 1
PageRank: 0.43
References: 18
Authors: 6

Name            Order   Citations   PageRank
Pairui Li       1       1           0.43
Chuan Chen      2       54          9.82
Wujie Zheng     3       254         15.92
Yuetang Deng    4       59          4.81
Fanghua Ye      5       3           1.15
Zibin Zheng     6       3731        199.37