Title
Picture My Voice: Audio to Visual Speech Synthesis using Artificial Neural Networks
Abstract
This paper presents an initial implementation and evaluation of a system that synthesizes visual speech directly from the acoustic waveform. An artificial neural network (ANN) was trained to map the cepstral coefficients of an individual's natural speech to the control parameters of an animated synthetic talking head. We trained on two data sets; one was a set of 400 words spoken in isolation by a single speaker and the other a subset of extemporaneous speech from 10 different speakers. The system showed learning in both cases. A perceptual evaluation test indicated that the system's generalization to new words by the same speaker provides significant visible information, but significantly below that given by a text-to-speech algorithm.
Year
1999
Venue
AVSP
Keywords
speech synthesis, neural network, artificial neural network, text to speech
Field
Motion capture, Speech synthesis, Gesture, Computer science, Communication channel, Speech recognition, Coarticulation, Animation, Artificial neural network, Perception
DocType
Conference
Citations
40
PageRank
1.90
References
8
Authors
5
Name, Order, Citations, PageRank
Dominic W. Massaro, 1, 391, 49.07
Jonas Beskow, 2, 668, 96.64
Michael M. Cohen, 3, 268, 34.79
Christopher L. Fry, 4, 40, 2.23
Tony Rodriguez, 5, 41, 2.29