Title
Conditional Sentence Generation and Cross-Modal Reranking for Sign Language Translation
Abstract
Sign Language Translation (SLT) aims to generate spoken language translations from sign language videos. Currently available sign language datasets are too small to learn the linguistic properties of spoken language. In this paper, towards effective SLT, we propose a novel framework that takes advantage of spoken language grammar learned from a large corpus of text sentences. Our framework consists of three key modules: word existence verification, conditional sentence generation, and cross-modal re-ranking. We first check the existence of each vocabulary word in the video through a series of binary classifications performed in parallel. The detected words are then assembled, guided by a pretrained spoken language generator, into multiple candidate sentences phrased in a spoken language manner. Finally, a cross-modal re-ranking model selects the sentence most semantically similar to the input sign video as the translation result. We evaluate our framework on two large-scale continuous SLT benchmarks, i.e., CSL and RWTH-PHOENIX-Weather 2014T. Experimental results demonstrate that the proposed framework achieves promising performance on both datasets.
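The abstract describes a three-stage pipeline: parallel word existence verification, conditional sentence generation with a pretrained spoken-language generator, and cross-modal re-ranking against the input video. The sketch below is only an illustration of how such a pipeline could be wired together, not the authors' implementation: the class names (WordExistenceVerifier, CrossModalReranker), the feature dimensions, the 0.5 detection threshold, and the assumption that candidate sentences arrive as precomputed text features from an external generator are all hypothetical choices made for this example.

```python
# Minimal sketch (assumed design, not the paper's code) of the three-stage SLT pipeline.
import torch
import torch.nn as nn

VOCAB_SIZE = 1000   # assumed spoken-language vocabulary size
FEAT_DIM = 512      # assumed pooled video / text feature dimension


class WordExistenceVerifier(nn.Module):
    """Stage 1: parallel binary classification, one sigmoid per vocabulary word."""
    def __init__(self, feat_dim=FEAT_DIM, vocab_size=VOCAB_SIZE):
        super().__init__()
        self.classifier = nn.Linear(feat_dim, vocab_size)

    def forward(self, video_feat):
        # video_feat: (1, feat_dim) pooled sign-video representation
        return torch.sigmoid(self.classifier(video_feat))  # per-word existence probabilities


class CrossModalReranker(nn.Module):
    """Stage 3: score semantic similarity between the video and each candidate sentence."""
    def __init__(self, feat_dim=FEAT_DIM, text_dim=FEAT_DIM, embed_dim=256):
        super().__init__()
        self.video_proj = nn.Linear(feat_dim, embed_dim)
        self.text_proj = nn.Linear(text_dim, embed_dim)

    def forward(self, video_feat, text_feats):
        v = nn.functional.normalize(self.video_proj(video_feat), dim=-1)
        t = nn.functional.normalize(self.text_proj(text_feats), dim=-1)
        return (v * t).sum(-1)  # cosine similarity per candidate


def translate(video_feat, candidate_text_feats, verifier, reranker, threshold=0.5):
    # Stage 1: keep vocabulary words whose existence probability exceeds the threshold.
    probs = verifier(video_feat)
    present_words = (probs > threshold).nonzero(as_tuple=True)[-1]

    # Stage 2 (not implemented here): a pretrained spoken-language generator would
    # assemble `present_words` into multiple candidate sentences; we assume their
    # text features are already given as `candidate_text_feats`.

    # Stage 3: pick the candidate sentence most similar to the input video.
    scores = reranker(video_feat.expand(candidate_text_feats.size(0), -1),
                      candidate_text_feats)
    return present_words, scores.argmax().item()


if __name__ == "__main__":
    verifier, reranker = WordExistenceVerifier(), CrossModalReranker()
    video_feat = torch.randn(1, FEAT_DIM)        # one pooled sign-video feature
    cand_text_feats = torch.randn(5, FEAT_DIM)   # features of 5 candidate sentences
    words, best = translate(video_feat, cand_text_feats, verifier, reranker)
    print("detected word ids:", words.tolist(), "| best candidate index:", best)
```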
Year
2022
DOI
10.1109/TMM.2021.3087006
Venue
IEEE TRANSACTIONS ON MULTIMEDIA
Keywords
Assistive technology, Videos, Gesture recognition, Feature extraction, Task analysis, Linguistics, Training, Sign language translation, conditional sentence generation, cross-modal reranking
DocType
Journal
Volume
24
ISSN
1520-9210
Citations
0
PageRank
0.34
References
0
Authors
6
Name          Order  Citations  PageRank
Jian Zhao     1      0          0.34
Weizhen Qi    2      0          0.34
Wengang Zhou  3      1226       79.31
Nan Duan      4      213        45.87
Ming Zhou     5      4262       251.74
Houqiang Li   6      2090       172.30