Realistic mouth animation based on an articulatory DBN model with constrained asynchrony - Citegraph

Paper Info

Title
Realistic mouth animation based on an articulatory DBN model with constrained asynchrony

Abstract
In this paper, we propose an approach for converting acoustic speech to video-realistic mouth animation based on an articulatory dynamic Bayesian network model with constrained asynchrony (AF_AVDBN). Conditional probability distributions are defined to control the asynchrony between articulators such as the lips, tongue, and glottis/velum. An EM-based conversion algorithm is also presented to learn the optimal visual features given an auditory input and the trained AF_AVDBN parameters. To train the AF_AVDBN models, downsampled YUV spatial frequency features are extracted from the interpolated mouth image sequences as visual features. To reproduce the mouth animation sequence from the learned visual features, spatial upsampling and temporal downsampling are applied. Both qualitative and quantitative results show that the proposed method produces more natural and realistic mouth animations, with higher accuracy than the state-of-the-art multi-stream hidden Markov model (MSHMM) and the articulatory DBN model without an asynchrony constraint (AF_DBN).

Year	DOI	Venue
2010	10.1109/ICASSP.2010.5494894	ICASSP
Keywords	Field	DocType
constrained asynchrony,statistical distributions,belief networks,speech processing,em-based conversion algorithm,optimal visual features,multistream hidden markov model,interpolation,hearing,computer animation,asynchrony,articulatory dynamic bayesian network model,af-dbn,interpolated mouth image sequences,af_avdbn,af-avdbn,feature extraction,acoustic speech,image sequences,constraint handling,spatial upsampling,video realistic mouth animation,probability distributions,af_dbn,temporal downsampling,mouth animation,conditional probability distribution,auditory input,hidden markov models,downsampled yuv spatial frequency,articulatory dbn model,dynamic bayesian network,visualization,animation,conditional probability,speech,probability distribution,bayesian methods,hidden markov model,spatial frequency,frequency	Speech processing,Pattern recognition,Visualization,Computer science,Speech recognition,Feature extraction,Artificial intelligence,Animation,Computer animation,Upsampling,Hidden Markov model,Dynamic Bayesian network	Conference
ISSN	ISBN	Citations
1520-6149	978-1-4244-4296-6	2
PageRank	References	Authors
0.38	5	5

Authors (5 rows)

Name	Order	Citations	PageRank
Jiang Dongmei	1	115	15.28
Ravyse Ilse	2	43	6.24
Peizhen Liu	3	3	0.73
Hichem Sahli	4	475	65.19
Werner Verhelst	5	431	51.55
