Title
Realistic mouth animation based on an articulatory DBN model with constrained asynchrony
Abstract
In this paper, we propose an approach for converting acoustic speech to video-realistic mouth animation, based on an articulatory dynamic Bayesian network model with constrained asynchrony (AF_AVDBN). Conditional probability distributions are defined to control the asynchrony between articulators such as the lips, tongue, and glottis/velum. An EM-based conversion algorithm is also presented to learn the optimal visual features given an auditory input and the trained AF_AVDBN parameters. To train the AF_AVDBN models, downsampled YUV spatial frequency features are extracted from the interpolated mouth image sequences and used as visual features. To reproduce the mouth animation sequence from the learned visual features, spatial upsampling and temporal downsampling are applied. Both qualitative and quantitative results show that the proposed method produces more natural and realistic mouth animations, and that it improves accuracy over the state-of-the-art multi-stream hidden Markov model (MSHMM) and over the articulatory DBN model without the asynchrony constraint (AF_DBN).
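The abstract mentions that the mouth image sequences are interpolated before feature extraction and that temporal downsampling is applied when the animation is reproduced. The record does not state the interpolation method or the frame rates, so the following is only a minimal sketch under assumptions: linear interpolation between consecutive feature vectors, an integer rate factor, and the hypothetical helper names `interpolate_frames` and `downsample_frames`.

```python
def interpolate_frames(frames, factor):
    """Linearly interpolate a sequence of feature vectors in time.

    frames: list of equal-length lists of floats, one vector per video frame.
    factor: integer upsampling factor (an assumed value, not from the paper).
    Returns (len(frames) - 1) * factor + 1 vectors.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    if len(frames) < 2:
        return [list(f) for f in frames]
    out = []
    for a, b in zip(frames, frames[1:]):
        for k in range(factor):
            t = k / factor  # fractional position between frame a and frame b
            out.append([(1 - t) * x + t * y for x, y in zip(a, b)])
    out.append(list(frames[-1]))  # keep the final original frame
    return out


def downsample_frames(frames, factor):
    """Temporal downsampling: keep every `factor`-th vector."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    return [list(f) for f in frames[::factor]]


if __name__ == "__main__":
    video_rate_features = [[0.0, 4.0], [4.0, 0.0], [8.0, 8.0]]
    upsampled = interpolate_frames(video_rate_features, 4)
    restored = downsample_frames(upsampled, 4)
    print(len(upsampled), restored == video_rate_features)
```

With an integer factor, downsampling the interpolated sequence by the same factor returns the original frames, which is why the two steps can bracket a model that runs at the higher (audio feature) rate.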
Year: 2010
DOI: 10.1109/ICASSP.2010.5494894
Venue: ICASSP
Keywords: constrained asynchrony, statistical distributions, belief networks, speech processing, em-based conversion algorithm, optimal visual features, multistream hidden markov model, interpolation, hearing, computer animation, asynchrony, articulatory dynamic bayesian network model, af-dbn, interpolated mouth image sequences, af_avdbn, af-avdbn, feature extraction, acoustic speech, image sequences, constraint handling, spatial upsampling, video realistic mouth animation, probability distributions, af_dbn, temporal downsampling, mouth animation, conditional probability distribution, auditory input, hidden markov models, downsampled yuv spatial frequency, articulatory dbn model, dynamic bayesian network, visualization, animation, conditional probability, speech, probability distribution, bayesian methods, hidden markov model, spatial frequency, frequency
Field: Speech processing, Pattern recognition, Visualization, Computer science, Speech recognition, Feature extraction, Artificial intelligence, Animation, Computer animation, Upsampling, Hidden Markov model, Dynamic Bayesian network
DocType: Conference
ISSN: 1520-6149 (E-ISBN: 978-1-4244-4296-6)
ISBN: 978-1-4244-4296-6
Citations: 2
PageRank: 0.38
References: 5
Authors: 5
Name              Order   Citations   PageRank
Jiang Dongmei     1       115         15.28
Ravyse Ilse       2       43          6.24
Peizhen Liu       3       3           0.73
Hichem Sahli      4       475         65.19
Werner Verhelst   5       431         51.55