Title
Towards automatic transcription of large spoken archives in agglutinating languages - Hungarian ASR for the MALACH project
Abstract
The paper describes automatic speech recognition experiments and results on the spontaneous Hungarian MALACH speech corpus. A novel morph-based lexical modeling approach is compared, in terms of word and letter error rates, to the traditional word-based approach and to the previously best-performing morph-based approach. The applied language and acoustic modeling techniques are also detailed. Using unsupervised speaker adaptation along with morph-based lexical models, absolute word error rate reductions of 14.4%-8.1% were achieved on a 2-speaker, 2-hour test set compared to the speaker-independent baseline results.
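The abstract reports results as word and letter error rates. The following is a minimal sketch of how such rates are commonly computed from an edit-distance alignment; it is not the paper's scoring tool. The function names and the Hungarian example strings are hypothetical, and ignoring spaces in the letter error rate is an assumption, since the record does not say how the authors counted letters.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference token
                            curr[j - 1] + 1,      # insertion of a hypothesis token
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(ref, hyp):
    """Word-level edit distance divided by the number of reference words."""
    ref_words = ref.split()
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


def letter_error_rate(ref, hyp):
    """Letter-level edit distance divided by the number of reference letters.

    Assumption: spaces are dropped before alignment.
    """
    ref_letters = list(ref.replace(" ", ""))
    hyp_letters = list(hyp.replace(" ", ""))
    return edit_distance(ref_letters, hyp_letters) / len(ref_letters)


# Hypothetical example: one missing suffix letter makes a whole word wrong.
reference = "a házban vagyok"
hypothesis = "a házba vagyok"
wer = word_error_rate(reference, hypothesis)
ler = letter_error_rate(reference, hypothesis)
```

In this example a single dropped letter counts as one full word error out of three words but only one letter error out of thirteen letters, which is one reason a letter-level rate is informative alongside the word-level rate for an agglutinating language.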
Year
2007
Venue
TSD
Keywords
agglutinating language, towards automatic transcription, lexical model, absolute word error rate, speaker independent baseline result, acoustic modeling technique, letter error rate, hungarian asr, applied language, lexical modeling approach, unsupervised speaker adaptation, automatic speech recognition experiment, malach project, spontaneous hungarian malach speech, word error rate, agglutinative languages, automatic speech recognition, error rate
Field
Speech corpus, Computer science, Word error rate, Speech recognition, Speaker recognition, Artificial intelligence, Speaker diarisation, Natural language processing, Language model, Acoustic model, Test set
DocType
Conference
Volume
4629
ISSN
0302-9743
ISBN
3-540-74627-7
Citations
3
PageRank
0.46
References
11
Authors
5
Name | Order | Citations and PageRank (run together in the source)
Péter Mihajlik | 1 | 5810.15
Tibor Fegyó | 2 | 6110.46
Bottyán Németh | 3 | 31240.04
Zoltán Tüske | 4 | 11917.32
Viktor Trón | 5 | 283.90