Abstract | ||
---|---|---|
Most speech recognition systems try to reconstruct a word sequence given an acoustic input, using prior information about the language being spoken. In some cases, there is more information available to the decoder than simply the acoustics. When decoding a television news broadcast, for example, the closed-caption information that is often recorded for hearing-impaired viewers may also be available. While these captions are generally not completely accurate transcriptions, they can be considered to be a strong hint as to what was actually spoken. In this paper, we present a formalization of this problem in terms of the source-channel paradigm. We propose a simple translation model for mapping caption sequences to word sequences which updates the language model with the prior information inherent in the captions. We also describe an efficient implementation of the search in a Viterbi decoder, and present results using this system in the broadcast news domain. |
Year | DOI | Venue |
---|---|---|
1996 | 10.1109/ICSLP.1996.607220 | ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4 |
Keywords | Field | DocType |
natural languages,language model,acoustic noise,viterbi decoding,decoding,sequences,viterbi decoder,computer science,decoder,indexing,workstations,radio broadcasting,television broadcasting,speech recognition | Broadcasting,Transcription (linguistics),Computer science,Communication channel,Speech recognition,Natural language,Viterbi decoder,Artificial intelligence,Natural language processing,Cheating,Decoding methods,Language model | Conference |
Citations | PageRank | References |
13 | 3.67 | 5 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Paul Placeway | 1 | 115 | 44.97 |
John D. Lafferty | 2 | 14904 | 1772.53 |