Title: Adapting language models for frequent fixed phrases by emphasizing n-gram subsets
Abstract: In support of speech-driven question answering, we propose a method for constructing N-gram language models that recognize spoken questions with high accuracy. Questions submitted to question-answering systems often consist of two parts: one conveys the query topic, and the other is a fixed phrase used in query sentences. A language model constructed from the target collection for QA (for example, newspaper articles) can model the former part, but cannot model the latter part appropriately. We treat this problem as task adaptation from language models obtained from background corpora (e.g., newspaper articles) to the fixed phrases, and propose a method that does not require a task-specific corpus, which is often difficult to obtain, but instead uses only a manually compiled list of fixed phrases. The method emphasizes the subset of N-grams, obtained from a background corpus, that corresponds to the fixed phrases in the list. Theoretically, the method can be regarded as maximum a posteriori probability (MAP) estimation that uses this N-gram subset as the a posteriori distribution. Experiments demonstrate the effectiveness of our method.
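A minimal Python sketch of the emphasis step, assuming trigram counts and a constant emphasis weight: the function names, the weight value, and the per-history renormalization are illustrative assumptions, since the abstract states only that the N-gram subset matching the listed fixed phrases is emphasized (the paper's MAP formulation would determine the actual weighting).

from collections import defaultdict

def ngrams(tokens, n):
    # All contiguous n-grams in a token sequence.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def emphasize_fixed_phrases(counts, fixed_phrases, n=3, weight=10.0):
    # Collect the n-grams that occur inside any manually listed fixed phrase.
    emphasized = set()
    for phrase in fixed_phrases:
        emphasized.update(ngrams(phrase.split(), n))
    # Scale up the background-corpus counts of the emphasized subset.
    # NOTE: a constant weight is a placeholder for the paper's MAP-derived weighting.
    boosted = {g: c * (weight if g in emphasized else 1.0)
               for g, c in counts.items()}
    # Renormalize per history (the first n-1 tokens) into P(w | history).
    history_totals = defaultdict(float)
    for g, c in boosted.items():
        history_totals[g[:-1]] += c
    return {g: c / history_totals[g[:-1]] for g, c in boosted.items()}

# Toy usage: trigram counts from a background corpus, plus one QA fixed phrase.
counts = {("what", "is", "the"): 40, ("what", "is", "a"): 10,
          ("is", "the", "capital"): 5, ("is", "the", "price"): 5}
probs = emphasize_fixed_phrases(counts, ["what is the capital of"])
print(probs[("what", "is", "the")])  # boosted relative to ("what", "is", "a")

The renormalization keeps each conditional distribution a proper probability, so emphasizing the fixed-phrase subset raises those transitions at the expense of competing n-grams with the same history.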
Year: 2003
Venue: INTERSPEECH
Keywords: language model, collection, question answering
Field: Computer science, A priori and a posteriori, Phrase, Task adaptation, Newspaper, A posteriori probability, Natural language processing, n-gram, Artificial intelligence, Language model, Question answering, Pattern recognition, Speech recognition
DocType: Conference
Citations: 3
PageRank: 0.50
References: 11
Authors: 3
Name             Order  Citations  PageRank
Tomoyosi Akiba   1      176        29.08
Katunobu Itou    2      319        44.36
Atsushi Fujii    3      486        59.25