Title
Chhattisgarhi speech corpus for research and development in automatic speech recognition.
Abstract
Automatic speech recognition (ASR) is a computerized interface that allows humans to communicate with machines through natural conversation. ASR has a wide range of applications, such as language development in young children, telecommunications, and assistive devices for the hearing impaired. The performance of an ASR system is greatly influenced by the database used for its implementation. In this paper, we discuss building a speech corpus for Chhattisgarhi, a rare but important Indian dialect. The corpus consists of 100 unique isolated words and four speech scripts totaling 67 sentences, recorded from a total of 478 native speakers. The words were selected from the English to Chhattisgarhi dictionary published by Chhattisgarh Rajbhasha Aayog, and the scripts from Chhattisgarhi literature and newspaper articles. The dataset was collected by travelling over 60% of the geographical area of Chhattisgarh state. The result is a valuable speech corpus, prepared for the first time for Chhattisgarhi, with the aim of enhancing speech research. Successful speech recognition experiments on both isolated and continuous speech samples have been demonstrated on the prepared database.
Year: 2018
DOI: 10.1007/s10772-018-9496-7
Venue: I. J. Speech Technology
Keywords: Chhattisgarhi, Speech corpus, Automatic speech recognition, Mel-frequency cepstral coefficients
Field: Speech corpus, Mel-frequency cepstrum, Conversation, Computer science, Speech recognition, Newspaper, Language development, Scripting language
DocType: Journal
Volume: 21
Issue: 2
ISSN: 1381-2416
Citations: 1
PageRank: 0.36
References: 9
Authors: 2
Name: Narendra D. Londhe; Order: 1; Citations: 98; PageRank: 13.85
Name: Ghanahshyam B. Kshirsagar; Order: 2; Citations: 6; PageRank: 1.42