ASR Corpus Design for Resource-Scarce Languages - Citegraph

Paper Info

Title
ASR Corpus Design for Resource-Scarce Languages

Abstract
We investigate the number of speakers and the amount of data that are required for the development of usable speaker-independent speech-recognition systems in resource-scarce languages. Our experiments employ the Lwazi corpus, which contains speech in the eleven official languages of South Africa. We find that a surprisingly small number of speakers (fewer than 50) and around 10 to 20 hours of speech per language are sufficient for the purposes of acceptable phone-based recognition.

Year	Venue	Keywords
2009	INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5	speech recognition, corpus design
Field	DocType	Citations
Speech corpus,Computer science,Speech recognition,Phone,Natural language processing,Artificial intelligence,VoxForge	Conference	24
PageRank	References	Authors
1.75	4	3

Authors (3 rows)

Name	Order	Citations	PageRank
Etienne Barnard	1	438	57.85
Marelie H. Davel	2	236	22.70
Charl Johannes van Heerden	3	133	12.50