Title
Web-based topic language modeling for audio indexing
Abstract
We describe the implementation of a scalable architecture for audio indexing, in which topic-dependent language models (LMs) were trained on web pages categorized in a portal web directory and stored on distributed servers. Input speech was decoded in parallel on servers, each of which held its own topic LM. From the decoders' outputs, an optimal hypothesis was chosen for each utterance by a topic-selection criterion that minimizes an energy function with three terms: likelihood scores for the utterances; keyword co-occurrence statistics, which measure long-distance correlation; and web-based hypothesis verification scores, which penalize misrecognized trigrams using web search results. Experimental results showed that the proposed approach outperformed the baseline topic-independent system by 6.0% absolute (20.0% relative) in character accuracy.
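The topic-selection step described in the abstract can be illustrated with a minimal sketch. The abstract names the three energy terms but gives neither their exact form nor the optimizer, so everything below is an assumption made for illustration: the additive form of the energy, the weights `w_cooc` and `w_verify`, the `cooc` callback, and the greedy coordinate-descent search (used here only because it is deterministic; it is not claimed to be the paper's method).

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical signature: cooc(u, topic_u, v, topic_v) -> co-occurrence score
CoocFn = Callable[[int, int, int, int], float]


def energy(choice: Sequence[int],
           loglik: Sequence[Sequence[float]],
           verify_penalty: Sequence[Sequence[float]],
           cooc: CoocFn,
           w_cooc: float = 1.0,
           w_verify: float = 1.0) -> float:
    """Assumed energy of picking the hypothesis from topic choice[u] for each utterance u."""
    total = 0.0
    # Per-utterance terms: negative log-likelihood plus web verification penalty.
    for u, t in enumerate(choice):
        total += -loglik[u][t] + w_verify * verify_penalty[u][t]
    # Long-distance term: keyword co-occurrence between every pair of chosen
    # hypotheses lowers the energy.
    for u in range(len(choice)):
        for v in range(u + 1, len(choice)):
            total -= w_cooc * cooc(u, choice[u], v, choice[v])
    return total


def select_topics(loglik: Sequence[Sequence[float]],
                  verify_penalty: Sequence[Sequence[float]],
                  cooc: CoocFn,
                  w_cooc: float = 1.0,
                  w_verify: float = 1.0,
                  max_sweeps: int = 20) -> Tuple[List[int], float]:
    """Greedy coordinate descent over per-utterance topic choices (illustrative only)."""
    n_utt = len(loglik)
    n_topics = len(loglik[0])
    # Start from the topic with the best likelihood for each utterance.
    choice = [max(range(n_topics), key=lambda t: loglik[u][t]) for u in range(n_utt)]
    best = energy(choice, loglik, verify_penalty, cooc, w_cooc, w_verify)
    for _ in range(max_sweeps):
        changed = False
        for u in range(n_utt):
            for t in range(n_topics):
                if t == choice[u]:
                    continue
                trial = choice[:u] + [t] + choice[u + 1:]
                e = energy(trial, loglik, verify_penalty, cooc, w_cooc, w_verify)
                if e < best:  # accept only moves that strictly lower the energy
                    choice, best, changed = trial, e, True
        if not changed:
            break  # local minimum reached
    return choice, best
```

Here `loglik[u][t]` and `verify_penalty[u][t]` stand for the score that the decoder with topic LM `t` assigns to utterance `u`. A search of this kind stops at a local minimum; a stochastic optimizer could be substituted without changing the `energy` function.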
Year: 2009
DOI: 10.1109/ICME.2009.5202622
Venue: ICME
Keywords: web search result, portal web directory, energy function, baseline topic-independent system, web page, character accuracy, web-based hypothesis verification score, web-based topic language modeling, audio indexing, optimal hypothesis, servers, language model, natural language processing, speech processing, decoding, speech, simulated annealing, information analysis, internet, web pages
Field: Speech processing, Information retrieval, Web page, Computer science, Trigram, Server, Search engine indexing, Natural language processing, Artificial intelligence, Web application, Web directory, Language model
DocType: Conference
ISSN: 1945-7871
Citations: 1
PageRank: 0.38
References: 7
Authors: 1
Name: Ken-ichi Iso
Order: 1
Citations: 35
PageRank: 5.35