Title
Predicting Tongue Motion in Unlabeled Ultrasound Videos Using Convolutional LSTM Neural Networks
Abstract
A challenge in speech production research is to predict future tongue movements from a short period of past tongue movements. This study tackles the speaker-dependent tongue motion prediction problem in unlabeled ultrasound videos with convolutional long short-term memory (ConvLSTM) networks. The model has been tested on two different ultrasound corpora. ConvLSTM outperforms a 3-dimensional convolutional neural network (3DCNN) in predicting the 9th frame from the 8 preceding frames, and also demonstrates good capacity to predict only the tongue contours in future frames. Further tests reveal that ConvLSTM can also learn to predict tongue movements in more distant frames, beyond the immediately following frame. Our code is available at: https://github.com/shuiliwanwu/ConvLstm-ultrasoundvideos.
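The framing in the abstract (8 past frames as input, the 9th frame as target, with optional prediction of more distant frames) can be illustrated with a small sliding-window sketch. This is not the authors' code: the function name `make_prediction_pairs` and the parameters `context` and `horizon` are hypothetical, and frames are stood in for by integers rather than ultrasound images.

```python
from typing import List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def make_prediction_pairs(
    frames: Sequence[Frame], context: int = 8, horizon: int = 1
) -> List[Tuple[List[Frame], Frame]]:
    """Slice a video into (input frames, target frame) pairs.

    context: number of past frames fed to the model (8 in the abstract).
    horizon: how far ahead the target lies; 1 means the frame right
             after the context window (the "9th frame"). Hypothetical
             parameter used here to illustrate more distant targets.
    """
    if context < 1 or horizon < 1:
        raise ValueError("context and horizon must be positive")
    pairs = []
    # Last start index for which the target frame still exists.
    last_start = len(frames) - context - horizon
    for start in range(last_start + 1):
        inputs = list(frames[start:start + context])
        target = frames[start + context + horizon - 1]
        pairs.append((inputs, target))
    return pairs


# Toy "video": each integer stands in for one ultrasound frame.
video = list(range(12))
pairs = make_prediction_pairs(video)               # next-frame targets
far_pairs = make_prediction_pairs(video, horizon=3)  # more distant targets
```

No manual labels are needed to build these pairs, since each target is simply a later frame of the same video; in practice each element would be an image array rather than an integer.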
Year
2019
Venue
2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Keywords
convolutional recurrent neural network, motion prediction, speech production, ultrasound tongue imaging, silent speech interface
Field
Task analysis, Pattern recognition, Convolutional neural network, Computer science, Linear programming, Artificial intelligence, Silent speech interface, Artificial neural network, Speech production, Tongue, Ultrasound
DocType
Journal
Volume
abs/1902.06927
ISSN
1520-6149
Citations
0
PageRank
0.34
References
4
Authors
6
Name | Order | Citations | PageRank
Chaojie Zhao | 1 | 0 | 0.34
Peng Zhang | 2 | 0 | 0.34
Jian Zhu | 3 | 15 | 4.11
Chengrui Wu | 4 | 0 | 0.34
Wang Huaimin | 5 | 1025 | 121.31
Kele Xu | 6 | 46 | 21.80