Title
Integration of acoustic and articulatory information with application to speech recognition
Abstract
In speech recognition, fusion of multiple systems often improves recognition accuracy or robustness. Previously proposed fusion approaches have focused mainly on the recognition process; training, on the other hand, is performed independently for each system. In this paper, we investigated the combination of a Mel-frequency cepstral coefficient (MFCC) based acoustic feature (ACF) system and an articulatory feature (AF) based system. In addition to proposing an asynchronous combination that makes the state combination more flexible during recognition, we proposed an efficient combination approach for the model training stage. We show that combining the models during training not only improved performance but also simplified the fusion process during recognition. Because fusion during training removes inconsistencies between the individual models, such as in state or phoneme alignments, it is particularly useful for highly constrained recognition fusion such as synchronous model combination. Compared with fusion of separately trained AF and ACF systems, fusion of jointly trained AF and ACF models reduced the phoneme recognition error on the TIMIT corpus by more than 3% absolute for synchronous combination and by 1% for asynchronous combination.
Year: 2004
DOI: 10.1016/j.inffus.2003.10.007
Venue: Information Fusion
Keywords: Speech recognition, Articulatory feature, Acoustic feature, Asynchronous combinations, Retraining parameters, System fusion, Joint training
Field: Asynchronous communication, Mel-frequency cepstrum, TIMIT, Pattern recognition, Fusion, Robustness (computer science), Speech recognition, Artificial intelligence, Phoneme recognition, Mathematics
DocType: Journal
Volume: 5
Issue: 2
ISSN: 1566-2535
Citations: 3
PageRank: 0.37
References: 16
Authors: 2
Name            Order   Citations   PageRank
Ka-Yee Leung    1       14          2.85
Manhung Siu     2       464         61.40