Title
Query log driven web search results clustering
Abstract
Different important studies in Web search results clustering have recently shown increasing performances motivated by the use of external resources. Following this trend, we present a new algorithm called Dual C-Means, which provides a theoretical background for clustering in different representation spaces. Its originality relies on the fact that external resources can drive the clustering process as well as the labeling task in a single step. To validate our hypotheses, a series of experiments are conducted over different standard datasets and in particular over a new dataset built from the TREC Web Track 2012 to take into account query logs information. The comprehensive empirical evaluation of the proposed approach demonstrates its significant advantages over traditional clustering and labeling techniques.
Year
2014
DOI
10.1145/2600428.2609583
Venue
SIGIR
Keywords
web search results clustering, automatic labeling, dual c-means, evaluation, clustering
Field
Data mining, Fuzzy clustering, CURE data clustering algorithm, Computer science, Artificial intelligence, Cluster analysis, Canopy clustering algorithm, Clustering high-dimensional data, Data stream clustering, Correlation clustering, Information retrieval, Brown clustering, Machine learning
DocType
Conference
Citations
7
PageRank
0.47
References
28
Authors
3
Name, Order, Citations, PageRank
Jose G. Moreno, 1, 50, 10.67
Gaël Dias, 2, 354, 41.95
Guillaume Cleuziou, 3, 129, 19.02