Title
Sparse non-negative matrix language modeling for geo-annotated query session data
Abstract
The paper investigates the impact on query language modeling of using skip-grams within a query as well as across queries in a given search session, in conjunction with the geo-annotation available for the query stream data. As a modeling tool we use the recently proposed sparse non-negative matrix estimation technique, since it offers the same expressive power as the well-established maximum entropy approach in combining arbitrary context features. Experiments on the google.com query stream show that using session-level and geo-location context we can expect reductions in perplexity (PPL) of 34% relative over the Kneser-Ney N-gram baseline; when evaluating on the "local" subset of the query stream, the relative reduction in PPL is 51%, more than a bit. Both sources of context information (geo-location, and previous queries in session) are about equally valuable in building a language model for the query stream.
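The phrase "more than a bit" can be read literally. Perplexity is 2 raised to the per-word cross-entropy in bits, so a relative perplexity reduction r lowers the cross-entropy by -log2(1 - r) bits. The short Python sketch below works through that arithmetic for the two figures quoted in the abstract. The function name `bits_saved` is a hypothetical helper introduced only for illustration, and reading the phrase as a pun on bits is an interpretation rather than something the record states.

```python
import math


def bits_saved(relative_ppl_reduction: float) -> float:
    """Cross-entropy reduction in bits per word for a relative perplexity reduction.

    Hypothetical helper for illustration. Since PPL = 2 ** H (H in bits),
    PPL_new = (1 - r) * PPL_old implies H_old - H_new = -log2(1 - r).
    """
    if not 0.0 <= relative_ppl_reduction < 1.0:
        raise ValueError("relative reduction must be in [0, 1)")
    return -math.log2(1.0 - relative_ppl_reduction)


# Relative reductions quoted in the abstract.
for label, reduction in [("full query stream", 0.34), ("local subset", 0.51)]:
    print(f"{label}: {bits_saved(reduction):.2f} bits per word")
# full query stream: 0.60 bits per word
# local subset: 1.03 bits per word
```

Halving the perplexity corresponds to exactly one bit, so the 51% reduction on the "local" subset is slightly more than one bit per word, while the 34% reduction on the full stream is about 0.6 bits.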
Year: 2015
DOI: 10.1109/ASRU.2015.7404767
Venue: 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
Keywords: language modeling, geo-location, query session, sparse non-negative matrix, voice search
Field: Query optimization, RDF query language, Query language, Information retrieval, Query expansion, Pattern recognition, Computer science, Sargable, Web query classification, Query by Example, Artificial intelligence, Boolean conjunctive query
DocType: Conference
Citations: 1
PageRank: 0.37
References: 9
Authors: 2
Name             Order  Citations  PageRank
Ciprian Chelba   1      1055       111.19
Noam Shazeer     2      1089       43.70