Title
Developing speaker independent ASR system using limited data through prosody modification based on fuzzy classification of spectral bins.
Abstract
The primary motive of this study is to develop an automatic speech recognition (ASR) system using a limited amount of speech data such that it is least affected by speaker-dependent acoustic variations. The two factors contributing to inter-speaker variability that this work focuses on are pitch and speaking-rate variations. To simulate such a limited-data scenario, an ASR system is trained on adults' speech and tested using speech data from adult as well as child speakers. Compared to the adults' speech test case, the recognition rates are noted to be extremely degraded when the test speech is from child speakers. The observed degradation is due to large differences in pitch and speaking-rate between adults' and children's speech, along with other factors leading to inter-speaker acoustic variations. To overcome the mismatch in pitch and speaking-rate, two different approaches are proposed in this paper. In the first approach, the pitch and speaking-rate of the children's speech test set are explicitly modified using a recently proposed prosody modification technique that exploits fuzzy classification of spectral bins. In the second approach, the pitch and speaking-rate of the training data are modified to create newer versions of the data. To capture greater acoustic variability, the original and the modified versions are then pooled together. The ASR system trained on the augmented data is noted to be more robust to pitch and speaking-rate variations. Consequently, relative improvements of 17% and 31% over the baseline are obtained on decoding the adults' and children's speech test sets, respectively.
Year
2019
DOI
10.1016/j.dsp.2019.06.015
Venue
Digital Signal Processing
Keywords
Speaker-independent ASR, Children's speech recognition, Prosody modification, Fuzzy classification, Data augmentation
Field
Training set, Prosody, Pattern recognition, Fuzzy classification, Artificial intelligence, Decoding methods, Mathematics, Test set
DocType
Journal
Volume
93
ISSN
1051-2004
Citations
0
PageRank
0.34
References
0
Authors
5
Name | Order | Citations | PageRank
S. Shahnawazuddin | 1 | 64 | 17.34
Adiga, N. | 2 | 10 | 3.60
B. Tarun Sai | 3 | 0 | 0.34
Waquar Ahmad | 4 | 8 | 5.90
Hemant K. Kathania | 5 | 0 | 2.37