Title
Continuous vocal imitation with self-organized vowel spaces in recurrent neural network
Abstract
A continuous vocal imitation system was developed using a computational model that explains the process of phoneme acquisition by infants. Human infants perceive speech sounds not as discrete phoneme sequences but as continuous acoustic signals. One of the critical problems in phoneme acquisition is how to segment these continuous speech sounds. The key idea for solving this problem is that articulatory mechanisms such as the vocal tract help human beings perceive speech sound units corresponding to phonemes. To segment acoustic signals together with articulatory movements, our system applies a segmentation method based on a Recurrent Neural Network with Parametric Bias (RNNPB). This method determines multiple segmentation boundaries in a temporal sequence using the prediction error of the RNNPB model, and the PB values it obtains can be regarded as encoding a kind of phoneme. Our system was implemented using a physical vocal tract model called the Maeda model. Experimental results demonstrated that our system can self-organize the same phonemes in different continuous sounds and can imitate vocal sounds involving arbitrary numbers of vowels using the vowel space in the RNNPB. This suggests that our model reflects the process of phoneme acquisition.
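The idea of placing segment boundaries where prediction error rises can be illustrated with a toy sketch. This is not the RNNPB model from the paper: as a loudly simplified stand-in, each segment is "predicted" by the running mean of its own frames, and that mean plays the role of a PB-like code. The function name `segment_by_prediction_error`, the threshold value, and the two-dimensional toy features are all assumptions made for illustration only.

```python
import numpy as np

def segment_by_prediction_error(signal, threshold):
    """Greedy toy segmentation (NOT the paper's RNNPB).

    The running mean of the current segment is used as the prediction
    for the next frame; a new segment starts when the squared
    prediction error exceeds `threshold`.
    """
    boundaries = [0]   # start index of every segment
    codes = []         # one PB-like code (segment mean) per segment
    start = 0
    for t in range(1, len(signal)):
        code = signal[start:t].mean(axis=0)          # stand-in for a PB vector
        error = float(np.sum((signal[t] - code) ** 2))
        if error > threshold:
            codes.append(code)
            boundaries.append(t)
            start = t
    codes.append(signal[start:].mean(axis=0))
    return boundaries, np.array(codes)

# Hypothetical toy "sound": vowel A, vowel B, vowel A, 20 frames each,
# with a small deterministic ripple added to every frame.
vowel_a = [0.0, 1.0]
vowel_b = [1.0, 0.0]
signal = np.repeat(np.array([vowel_a, vowel_b, vowel_a]), 20, axis=0)
signal = signal + 0.01 * np.sin(np.arange(60))[:, None]

boundaries, codes = segment_by_prediction_error(signal, threshold=0.5)
print(boundaries)  # [0, 20, 40]
```

In this toy setting the first and third segments receive nearly identical codes, which loosely mirrors the claim that the same phoneme is mapped to the same region of the PB space even when it occurs in different places in a continuous sound.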
Year
2009
DOI
10.1109/ROBOT.2009.5152818
Venue
ICRA
Keywords
discrete phoneme sequence,continuous acoustic signal,different continuous sound,maeda model,rnnpb model,continuous vocal imitation system,phoneme acquisition,self-organized vowel space,recurrent neural network,physical vocal tract model,continuous speech,computational model,neuroscience,predictive models,self organization,computational modeling,computer networks,speech recognition,speech,natural languages,acoustics,recurrent neural networks,computer model,vocal tract,prediction error,speech processing,pediatrics,silicon
Field
Speech processing,Mean squared prediction error,Computer science,Segmentation,Recurrent neural network,Speech recognition,Parametric statistics,Imitation,Vowel,Vocal tract
DocType
Conference
Volume
2009
Issue
1
ISSN
1050-4729
Citations
7
PageRank
0.71
References
7
Authors
5
Name
Order
Citations
PageRank
Hisashi Kanda1142.25
Tetsuya Ogata21158135.73
Toru Takahashi333739.39
Kazunori Komatani479087.95
Hiroshi G. Okuno52092233.19