Abstract | ||
---|---|---|
This paper summarizes our recent efforts to build a Turkish Broadcast News transcription and retrieval system. The agglutinative nature of Turkish leads to a high number of out-of-vocabulary (OOV) words, which in turn lower automatic speech recognition (ASR) accuracy. This situation compromises the performance of speech retrieval systems based on ASR output. Therefore, a word-based ASR system is not adequate for transcribing speech in Turkish. To alleviate this problem, various sub-word-based recognition units are utilized. These units solve the OOV problem with moderate-size vocabularies and perform even better than a 500K-word vocabulary in terms of recognition accuracy. As a novel approach, the interaction between the recognition units (words and sub-words) and discriminative training is explored. Sub-word models benefit from discriminative training more than word models do, especially in the discriminative language modeling framework. For speech retrieval, a spoken term detection system based on automata indexation is utilized. As with transcription, retrieval performance is measured under various schemes incorporating words and sub-words. The best results are obtained using a cascade of word and sub-word indexes together with term-specific thresholding. |
Year | DOI | Venue |
---|---|---|
2009 | 10.1109/TASL.2008.2012313 | IEEE Transactions on Audio, Speech & Language Processing |
Keywords | Field | DocType |
automatic speech recognition,recognition accuracy,retrieval system,retrieval performance,speech retrieval system,turkish broadcast news transcription,transcribing speech,speech retrieval,discriminative training,recognition unit,various sub-word-based recognition unit,broadcasting,language model,information retrieval,automata,natural language processing,statistical analysis,indexation,speech recognition,morphology,natural languages | Speech processing,Turkish,Computer science,Agglutinative language,Speech recognition,Natural language,Artificial intelligence,Natural language processing,Vocabulary,Discriminative model,Language model,Cable television | Journal |
Volume | Issue | ISSN |
17 | 5 | 1558-7916 |
Citations | PageRank | References |
43 | 1.74 | 38 |
Authors | ||
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ebru Arisoy | 1 | 418 | 25.32 |
D. Can | 2 | 43 | 1.74 |
Siddika Parlak | 3 | 113 | 6.82 |
Hasim Sak | 4 | 690 | 39.56 |
M. Saraclar | 5 | 193 | 7.88 |