Abstract | ||
---|---|---|
Because natural language is inherently ambiguous and flexible, users often issue many different queries to look for the same target. Some previous work clusters queries based on co-clicks; however, the queries in such a cluster are often only roughly related rather than truly similar in intent. It is desirable to automatically mine queries with equivalent intents from large-scale search logs. In this paper, we take into account the similarities between query strings. Two issues arise with such similarities: comparing every pair of queries in large-scale search logs is too costly, and two queries with similar formulations, such as "SVN" (Apache Subversion) and "SVM" (support vector machine), do not necessarily have similar intents. To address these issues, we propose applying query string similarities on top of the co-click based clustering results. Our method improves precision over the co-click based clustering method (lifting precision from 0.37 to 0.62) and outperforms a commercial search engine's query alteration (lifting the $F_1$ measure from 0.42 to 0.56). As an application, we consider web document retrieval: we aggregate the click-throughs of similar queries with a query's own click-throughs and evaluate the result on a large-scale dataset. Experimental results indicate that our proposed method significantly outperforms the baseline of using only a query's own click-throughs on all metrics. |
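
The two-stage idea in the abstract (cluster by co-clicks first, then compare query strings only inside each cluster) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the clustering algorithm or the string similarity measure, so the sketch assumes connected components over shared clicked URLs and Python's `difflib.SequenceMatcher` ratio as stand-ins. The click log, the `example.org` URLs, the function names, and the 0.6 threshold are all hypothetical.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def _connected_groups(items, edges):
    """Return the connected components of `items` under `edges` (union-find)."""
    parent = {x: x for x in items}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for x in items:
        groups[find(x)].add(x)
    return list(groups.values())


def co_click_clusters(clicks):
    """Stage 1 (assumed): queries sharing a clicked URL fall in one cluster."""
    queries = {query for query, _ in clicks}
    by_url = defaultdict(list)
    for query, url in clicks:
        by_url[url].append(query)
    edges = [(qs[0], q) for qs in by_url.values() for q in qs[1:]]
    return _connected_groups(queries, edges)


def refine_by_string_similarity(cluster, threshold):
    """Stage 2 (assumed): keep together only queries with similar strings."""
    edges = [
        (a, b)
        for a, b in combinations(sorted(cluster), 2)
        if SequenceMatcher(None, a, b).ratio() >= threshold
    ]
    return _connected_groups(cluster, edges)


def mine_similar_queries(clicks, threshold=0.6):
    """Pairwise string comparison happens only within a co-click cluster."""
    groups = []
    for cluster in co_click_clusters(clicks):
        groups.extend(refine_by_string_similarity(cluster, threshold))
    return groups


# Hypothetical (query, clicked URL) pairs, for illustration only.
clicks = [
    ("svn", "example.org/subversion"),
    ("apache subversion", "example.org/subversion"),
    ("apache subversoin", "example.org/subversion"),
    ("svm", "example.org/svm"),
    ("support vector machine", "example.org/svm"),
    ("support vector machines", "example.org/svm"),
]

groups = mine_similar_queries(clicks)
```

In this toy log, "svn" and "svm" are similar enough as strings to pass the 0.6 threshold, but they are never compared because they land in different co-click clusters. Restricting comparisons to a cluster also avoids the all-pairs cost over the whole log. A plain character-level ratio does not link an abbreviation such as "svn" to "apache subversion", so a real system would need a richer similarity measure than the one assumed here.
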
Year | DOI | Venue |
---|---|---|
2016 | 10.1007/s10791-016-9288-0 | Inf. Retr. Journal |
Keywords | Field | DocType |
Mining similar queries, Query intent, Web search | Web search query, Data mining, Query language, Query string, Query expansion, Information retrieval, Computer science, Web query classification, Queries per second, Spatial query, Cluster analysis | Journal |
Volume | Issue | ISSN |
19 | 6 | 1386-4564 |
Citations | PageRank | References |
1 | 0.37 | 34 |
Authors | ||
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ruihua Song | 1 | 1138 | 59.33 |
Dingquan Wang | 2 | 11 | 2.51 |
Jian-yun Nie | 3 | 3681 | 238.61 |
Ji-Rong Wen | 4 | 4431 | 265.98 |
Yong Yu | 5 | 7637 | 380.66 |