Title
Improving Automatic Speech Recognition by Classifying Adult and Child Speakers into Separate Groups using Speech Rate Rhythmicity Parameter
Abstract
When children's speech is transcribed using acoustic models trained on adults' data, recognition performance degrades severely. Similar degradation is observed when adults' speech is recognized by an automatic speech recognition (ASR) system trained on children's speech. This problem can be overcome by using two separate ASR systems, one for each group of speakers, but that approach requires an effective technique for detecting whether a given utterance comes from an adult or a child speaker. In this paper, we present a simple and novel technique for this detection task, based on the speech-rate rhythmicity parameter (SRRP). Since the speaking rates of adults and children differ significantly, the SRRP values are also very different for the two groups. Hence, by computing the SRRP value of a given speech utterance, it can be easily determined whether the utterance is from an adult or a child speaker. The corresponding ASR system can then be used to achieve improved recognition performance. Alternatively, existing techniques for improving children's speech recognition on systems trained on adult data can be applied directly once it is known that the data is from a child speaker. Both these aspects have been experimentally validated in this work.
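The routing idea described in the abstract can be illustrated with a minimal sketch. The abstract does not define how SRRP is computed, what decision threshold is used, or which group has the larger SRRP values, so all of these are assumptions here: `compute_srrp` is a hypothetical callable supplied by the caller, `threshold` is a hypothetical tuning value, and `child_above_threshold` is a hypothetical flag for the direction of the comparison. The two recognizers are likewise placeholders.

```python
from typing import Callable, Sequence, Tuple

Samples = Sequence[float]


def classify_speaker(
    samples: Samples,
    compute_srrp: Callable[[Samples], float],  # hypothetical: SRRP is not defined in the abstract
    threshold: float,                          # hypothetical decision boundary
    child_above_threshold: bool = True,        # hypothetical: direction is an assumption
) -> str:
    """Label one utterance as 'child' or 'adult' from its SRRP value."""
    srrp = compute_srrp(samples)
    if child_above_threshold:
        is_child = srrp > threshold
    else:
        is_child = srrp < threshold
    # A value exactly at the threshold is treated as adult in both cases.
    return "child" if is_child else "adult"


def transcribe(
    samples: Samples,
    compute_srrp: Callable[[Samples], float],
    threshold: float,
    adult_asr: Callable[[Samples], str],  # placeholder for the adult-trained recognizer
    child_asr: Callable[[Samples], str],  # placeholder for the child-trained recognizer
    child_above_threshold: bool = True,
) -> Tuple[str, str]:
    """Classify the speaker group, then decode with the matching recognizer."""
    group = classify_speaker(samples, compute_srrp, threshold, child_above_threshold)
    recognizer = child_asr if group == "child" else adult_asr
    return group, recognizer(samples)
```

Keeping the SRRP computation and both recognizers as injected callables means the sketch makes no claim about the paper's actual signal processing; it only shows the classify-then-dispatch structure.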
Year | DOI | Venue
2020 | 10.1109/SPCOM50965.2020.9179497 | 2020 International Conference on Signal Processing and Communications (SPCOM)
Keywords | DocType | ISSN
Speech recognition, children's speech recognition, speaking-rate, speech-rate rhythmicity parameter | Conference | 2474-9168
ISBN | Citations | PageRank
978-1-7281-8896-6 | 0 | 0.34
References | Authors
10 | 3
Name | Order | Citations | PageRank
S. Shahnawazuddin | 1 | 6417.34
Tarun Sai Bandarupalli | 2 | 0 | 0.34
R. Chakravarthy | 3 | 0 | 0.34