Title
Mining Audio/Visual Database For Speech Driven Face Animation
Abstract
In this paper, we present a data-mining framework for audio-visual interaction and apply it to a speech-driven facial animation system. First, an unsupervised clustering algorithm is proposed to group facial configurations into clusters of similar shape. A statistical visual model is then constructed by enumerating the possible cluster trajectories. The audio is analyzed with respect to the learned clusters of facial gestures: for each cluster, two neural networks are trained to map audio features to the cluster label and the cluster velocity, respectively. Given a new vocal track, the statistical visual model and the neural networks are combined to analyze the control audio, yielding the most likely facial state sequence. The proposed method not only automatically captures vocal and facial dynamics such as co-articulation, but is also easy to train, robust, extensible, and interpretable. Two evaluation approaches are also proposed. The performance of our system shows that the proposed learning algorithm is well suited to the task and greatly improves the realism of face animation during speech.
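The pipeline the abstract describes (cluster visual configurations, then learn an audio-to-cluster mapping) can be sketched roughly as follows. This is a minimal illustrative stand-in, not the paper's implementation: k-means replaces the unspecified clustering algorithm, per-cluster audio prototypes replace the trained neural networks, and all data, dimensions, and names (`audio`, `visual`, `predict_cluster`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(X, k, iters=20):
    # Unsupervised clustering of visual configurations (a stand-in for
    # the paper's cluster algorithm). Returns centroids and frame labels.
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids, labels

# Synthetic paired training data: per-frame audio features (MFCC-like)
# and visual mouth-shape parameters. Dimensions are illustrative only.
audio = rng.normal(size=(300, 12))
visual = rng.normal(size=(300, 6))

centroids, labels = kmeans(visual, k=8)

# One audio prototype per visual cluster plays the role of the trained
# networks: a new audio frame maps to the cluster whose prototype is nearest.
audio_proto = np.stack([
    audio[labels == j].mean(axis=0) if np.any(labels == j) else audio.mean(axis=0)
    for j in range(8)
])

def predict_cluster(a):
    # Map one audio feature vector to a facial-cluster label.
    return int(np.argmin(((audio_proto - a) ** 2).sum(-1)))

# A decoded facial state sequence for the first ten audio frames; the
# paper additionally constrains such sequences with its trajectory model.
seq = [predict_cluster(a) for a in audio[:10]]
```

In the actual system, the frame-wise predictions would be re-scored against the statistical visual model of cluster trajectories to enforce smooth, co-articulated motion.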
Year
2001
DOI
null
Venue
2001 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS, VOLS 1-5: E-SYSTEMS AND E-MAN FOR CYBERNETICS IN CYBERSPACE
Keywords
data mining, facial animation, lip-syncing
Field
Computer science, Gesture, Audio mining, Speech recognition, Animation, Computer facial animation, Artificial neural network, Computer animation, Audio signal processing, Cluster analysis
DocType
Conference
Volume
4
Issue
null
ISSN
1062-922X
Citations
1
PageRank
0.37
References
8
Authors
5

Name           Order  Citations  PageRank
Yiqiang Chen   1      1446       109.32
Wen Gao        2      11374      741.77
Zhaoqi Wang    3      225        33.91
Jun Miao       4      220        22.17
Dalong Jiang   5      203        10.26