Title
Reducing computational and memory cost for cellular phone embedded speech recognition system
Abstract
This paper focuses on cellular phone embedded speech recognition. We present several methods for fitting speech recognition system requirements to cellular phone resources. The proposed techniques are evaluated on a digit recognition task using both French and English corpora. We investigate three aspects of speech processing in particular: acoustic parameterization, recognition algorithms and acoustic modeling. Several parameterization algorithms (LPCC, MFCC and PLP) are compared to the Linear Predictive Coding (LPC) included in the GSM standard. The MFCC and PLP parameterization algorithms perform significantly better than the other ones. Moreover, the feature vector size can be reduced to 6 PLP coefficients, which decreases memory and computation requirements without a significant loss of performance. In order to achieve good performance with reasonable resource needs, we develop several methods to embed a classical HMM-based speech recognition system in a cellular phone. We first propose an automatic on-line building of the phonetic lexicon, which allows a minimal but unlimited lexicon. Then we reduce the HMM model complexity by decreasing the number of (Gaussian) components per state. Finally, we evaluate our propositions by comparing Dynamic Time Warping (DTW) with our HMM system, in the context of cellular phones, for clean conditions. The experiments show that our HMM system outperforms DTW for speaker-independent tasks and allows more practical applications for the cellular-phone user interface.
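For readers unfamiliar with the DTW baseline named in the abstract, the following is a minimal sketch of template matching with Dynamic Time Warping. It is not the authors' implementation: the function names `dtw_distance` and `recognize`, the Euclidean frame distance, and the unconstrained step pattern are all assumptions made for illustration, and the feature vectors stand in for whatever parameterization (for example PLP coefficients) a real front end would produce.

```python
import math


def dtw_distance(a, b):
    """Cumulative DTW cost between two sequences of feature vectors.

    Assumption: frames are compared with the Euclidean distance and the
    warping path may move one step in either sequence or in both.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def recognize(utterance, templates):
    """Return the label of the stored template closest to the utterance.

    `templates` is a hypothetical dict mapping a word label to one
    reference sequence of feature vectors.
    """
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))
```

A sketch like this needs one stored template per word, typically recorded by the phone's owner, which is in line with the abstract's observation that the HMM system does better on speaker-independent tasks.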
Year: 2004
DOI: 10.1109/ICASSP.2004.1327109
Venue: ICASSP '04). IEEE International Conference
Keywords: cellular radio, embedded systems, hidden Markov models, linear predictive coding, mobile handsets, speech coding, speech recognition, English corpora, French corpora, GSM, HMM, LPC, MFCC, Mel frequency cepstral coefficients, acoustic modeling, acoustic parameterization, cellular phone embedded speech recognition system, computational cost reduction, digit recognition task, dynamic time warping, linear predictive cepstrum coefficients, memory cost reduction, perceptual linear predictive coefficients, phonetic lexicon, speaker independent tasks, user interface
Field: Speech processing, Mel-frequency cepstrum, GSM, Feature vector, Speech coding, Pattern recognition, Dynamic time warping, Computer science, Speech recognition, Artificial intelligence, Hidden Markov model, Linear predictive coding
DocType: Conference
Volume: 5
ISSN: 1520-6149
ISBN: 0-7803-8484-9
Citations: 5
PageRank: 0.65
References: 4
Authors: 4
Name
Order
Citations
PageRank
Christophe Levy11215.00
Georges Linarès213629.55
Pascal Nocera37010.86
Jean-François Bonastre46410.60