Abstract |
---|
We propose a simple but effective weighted finite state transducer (WFST) based framework for handling out-of-vocabulary (OOV) keywords in a speech search task. State-of-the-art large vocabulary continuous speech recognition (LVCSR) and keyword search (KWS) systems are developed for conversational telephone speech in Tagalog. Word-based and phone-based indexes are created from word lattices, the latter by using the LVCSR system's pronunciation lexicon. Pronunciations of OOV keywords are hypothesized via a standard grapheme-to-phoneme method. In-vocabulary proxies (word or phone sequences) are generated for each OOV keyword using WFST techniques that permit incorporation of a phone confusion matrix. Empirical results when searching for the Babel/NIST evaluation keywords in the Babel 10-hour development-test speech collection show that (i) searching for word proxies in the word index significantly outperforms searching for phonetic representations of OOV words in a phone index, and (ii) while phone confusion information yields only a minor improvement when searching a phone index, it yields up to 40% improvement in actual term weighted value when searching a word index with word proxies. |
Year | DOI | Venue |
---|---|---|
2013 | 10.1109/ASRU.2013.6707766 | ASRU |
Keywords | Field | DocType |
low resource lvcsr,term weighted value,standard grapheme-to-phoneme method,oov keyword search task,in-vocabulary proxies,keyword search,speech recognition,word-based index,oov keyword pronunciation,wfst-based framework,large-vocabulary continuous speech recognition,weighted finite state transducer,vocabulary,empirical analysis,nist evaluation keyword,word proxy search,word sequences,kws system,babel evaluation keyword,proxy keywords,phone-based index,conversational telephone speech,phone confusion matrix,tagalog,oov keywords,word lattices,speech search task,document handling,lvcsr system pronunciation lexicon,phone sequences,query processing,out-of-vocabulary keyword handling | Pronunciation,Confusion,Confusion matrix,Computer science,Keyword search,Speech recognition,Lexicon,NIST,Phone,Artificial intelligence,Natural language processing,Vocabulary | Conference |
Citations | PageRank | References |
36 | 1.42 | 12 |
Authors |
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Guoguo Chen | 1 | 428 | 19.89 |
Oguz Yilmaz | 2 | 52 | 2.61 |
Jan Trmal | 3 | 235 | 20.91 |
Daniel Povey | 4 | 2442 | 231.75 |
Sanjeev Khudanpur | 5 | 2155 | 202.00 |