Title
Unsupervised query segmentation using only query logs
Abstract
We introduce an unsupervised query segmentation scheme that uses query logs as the only resource and can effectively capture the structural units in queries. We believe that Web search queries have a unique syntactic structure which is distinct from that of English or a bag-of-words model. The segments discovered by our scheme help understand this underlying grammatical structure. We apply a statistical model based on Hoeffding's Inequality to mine significant word n-grams from queries and subsequently use them for segmenting the queries. Evaluation against manually segmented queries shows that this technique can detect rare units that are missed by our Pointwise Mutual Information (PMI) baseline.
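The abstract names a Pointwise Mutual Information (PMI) baseline but this record gives no details of it. The following is a minimal sketch of one common way to segment queries with PMI, not the authors' implementation: adjacent words stay in the same segment when their PMI, estimated from the query log alone, reaches a threshold. The function names, the toy log, and the threshold value are all assumptions made for illustration.

```python
import math
from collections import Counter


def count_ngrams(queries):
    """Count unigrams and adjacent-word bigrams over a list of query strings."""
    unigrams, bigrams = Counter(), Counter()
    for q in queries:
        words = q.lower().split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams


def pmi(w1, w2, unigrams, bigrams):
    """PMI (base 2) of an adjacent word pair; -inf if the pair was never seen."""
    pair = bigrams[(w1, w2)]
    if pair == 0:
        return float("-inf")
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    p_pair = pair / n_bi
    p1 = unigrams[w1] / n_uni
    p2 = unigrams[w2] / n_uni
    return math.log2(p_pair / (p1 * p2))


def segment(query, unigrams, bigrams, threshold=2.0):
    """Split a query wherever the PMI of adjacent words falls below threshold.

    The threshold is a hypothetical tuning parameter, not a value from the paper.
    """
    words = query.lower().split()
    if not words:
        return []
    segments = [[words[0]]]
    for prev, cur in zip(words, words[1:]):
        if pmi(prev, cur, unigrams, bigrams) >= threshold:
            segments[-1].append(cur)   # strong association: extend the segment
        else:
            segments.append([cur])     # weak association: start a new segment
    return [" ".join(s) for s in segments]


# Toy query log (invented for illustration only).
toy_log = ["new york hotels", "new york pizza", "cheap hotels",
           "cheap pizza", "new york"]
uni, bi = count_ngrams(toy_log)
print(segment("new york hotels", uni, bi, threshold=2.0))
```

On this toy log, "new york" is kept together and "hotels" is split off. A pair that never occurs in the log gets a PMI of negative infinity and is always split. Because PMI depends on raw co-occurrence counts, it is unreliable for infrequent pairs, and the abstract reports that the Hoeffding-based model detects rare units this baseline misses.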
Year
2011
DOI
10.1145/1963192.1963239
Venue
WWW (Companion Volume)
Keywords
web search query, unsupervised query segmentation, scheme help, unsupervised query segmentation scheme, segmented query, statistical model, underlying grammatical structure, query log, pointwise mutual information, unique syntactic structure, bag-of-words model, bag of words, Hoeffding's inequality
Field
Query optimization, Web search query, Data mining, Query language, Query expansion, Information retrieval, Computer science, Sargable, Web query classification, Spatial query, Online aggregation
DocType
Conference
Citations
22
PageRank
0.82
References
4
Authors
5
Name | Order | Citations | PageRank
Nikita Mishra | 1 | 22 | 0.82
Rishiraj Saha Roy | 2 | 112 | 15.17
Niloy Ganguly | 3 | 1306 | 121.03
Srivatsan Laxman | 4 | 421 | 21.65
Monojit Choudhury | 5 | 346 | 48.32