Abstract | ||
---|---|---|
The spoken term detection (STD) task aims to return relevant segments from a spoken archive that contain the query terms, whether or not they are in the system vocabulary. This paper focuses on pronunciation modeling for Out-of-Vocabulary (OOV) terms, which frequently occur in STD queries. The STD system described in this paper uses Weighted Finite State Transducers (WFSTs) to index word-level and sub-word-level lattices or confusion networks produced by an LVCSR system. We investigate the inclusion of n-best pronunciation variants for OOV terms (obtained from letter-to-sound rules) into the search and present the results obtained by indexing confusion networks as well as lattices. Two observations are worth mentioning: phone indexes generated from sub-words represent OOVs well, and too many variants for the OOV terms degrade performance if pronunciations are not weighted. |
Year | DOI | Venue |
---|---|---|
2009 | 10.1109/ICASSP.2009.4960494 | ICASSP |
Keywords | Field | DocType |
speech recognition | Pronunciation,Confusion,Pattern recognition,Computer science,Search engine indexing,Speech recognition,Finite state,NIST,Artificial intelligence,Natural language processing,Decoding methods,Vocabulary | Conference |
ISSN | Citations | PageRank |
1520-6149 | 30 | 1.59 |
References | Authors | |
15 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Dogan Can | 1 | 128 | 10.64 |
Erica Cooper | 2 | 51 | 4.19 |
Abhinav Sethy | 3 | 363 | 31.16 |
Chris White | 4 | 30 | 1.59 |
Bhuvana Ramabhadran | 5 | 1779 | 153.83 |
Murat Saraclar | 6 | 669 | 62.91 |