Abstract | ||
---|---|---|
The paper investigates the impact on query language modeling of using skip-grams within a query as well as across queries in a given search session, in conjunction with the geo-annotation available for the query stream data. As a modeling tool we use the recently proposed sparse non-negative matrix estimation technique, since it offers the same expressive power as the well-established maximum entropy approach in combining arbitrary context features. Experiments on the google.com query stream show that using session-level and geo-location context we can expect reductions in perplexity (PPL) of 34% relative over the Kneser-Ney N-gram baseline; when evaluating on the "local" subset of the query stream, the relative reduction in PPL is 51%, more than a bit. Both sources of context information (geo-location, and previous queries in session) are about equally valuable in building a language model for the query stream. |
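
The "more than a bit" remark in the abstract can be read literally: a relative perplexity reduction corresponds to a drop in cross-entropy measured in bits. The following is a minimal sketch of that arithmetic, assuming perplexity is defined per word as 2 raised to the cross-entropy in bits; the function name `bits_saved` is hypothetical and does not come from the paper.

```python
import math


def bits_saved(relative_ppl_reduction: float) -> float:
    """Entropy reduction in bits per word for a relative perplexity reduction.

    Assumes PPL = 2 ** H with H in bits, so scaling PPL by (1 - r)
    lowers H by -log2(1 - r).
    """
    if not 0.0 <= relative_ppl_reduction < 1.0:
        raise ValueError("relative reduction must be in [0, 1)")
    return -math.log2(1.0 - relative_ppl_reduction)


# The two relative reductions quoted in the abstract.
for r in (0.34, 0.51):
    print(f"{r:.0%} relative PPL reduction = {bits_saved(r):.2f} bits per word")
```

Under this assumption, the 34% reduction comes to roughly 0.6 bits, while the 51% reduction on the "local" subset comes to just over one bit, since halving perplexity corresponds to exactly one bit.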
Year | DOI | Venue |
---|---|---|
2015 | 10.1109/ASRU.2015.7404767 | 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) |
Keywords | Field | DocType |
language modeling, geo-location, query session, sparse non-negative matrix, voice search | Query optimization, RDF query language, Query language, Information retrieval, Query expansion, Pattern recognition, Computer science, Sargable, Web query classification, Query by Example, Artificial intelligence, Boolean conjunctive query | Conference |
Citations | PageRank | References |
1 | 0.37 | 9 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ciprian Chelba | 1 | 1055 | 111.19 |
Noam Shazeer | 2 | 1089 | 43.70 |