
Paper Info

Title
Phonotactic Spoken Language Identification With Limited Training Data

Abstract
We investigate the addition of a new language, for which limited resources are available, to a phonotactic language identification system. Two classes of approaches are studied: in the first class, only existing phonetic recognizers are employed, whereas an additional phonetic recognizer in the new language is created for the second class. It is found that the number of acoustic recognizers employed plays a crucial role in determining the recognition accuracy for the new language. We study different approaches to incorporating a language for which audio-only data is available (no pronunciation dictionaries or transcriptions) and find that if more than about 2 000 training utterances are available, a bootstrapped acoustic model for the new language can improve accuracy substantially.

Year
2007

Venue
INTERSPEECH

Keywords
spoken language identification, generalization, resource scarce languages

Field
Pronunciation, Transcription (linguistics), Phonotactics, Computer science, Bootstrapping, Speech recognition, First class, Natural language processing, Language identification, Artificial intelligence, Constructed language, Acoustic model

DocType
Conference

Citations
0

PageRank
0.34

References
8

Authors
3

Authors (3 rows)

Name	Order	Citations	PageRank
Marius Peche	1	0	0.34
Marelie H. Davel	2	236	22.70
Etienne Barnard	3	438	57.85
