Title
A Concise Integer Linear Programming Formulation for Implicit Search Result Diversification.
Abstract
To cope with ambiguous and/or underspecified queries, search result diversification (SRD) is a key technique that has attracted considerable attention. This paper focuses on implicit SRD, where the possible subtopics underlying a query are unknown beforehand. We formulate implicit SRD as the process of selecting and ranking k exemplar documents, and we solve it with integer linear programming (ILP). Unlike the common practice of relying on approximate methods, this formulation enables us to obtain the optimal solution of the objective function. Extensive experiments on four benchmark collections reveal that: (1) Factors such as the initial run, the number of input documents, the query type, and the way document similarity is computed significantly affect the performance of diversification models. Careful examination of these factors is highly recommended when developing implicit SRD methods. (2) The proposed method can achieve substantially improved performance over the state-of-the-art unsupervised methods for implicit SRD.
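The idea of selecting k exemplar documents with an exact (rather than approximate) optimum can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the objective (a weighted sum of exemplar relevance and of each remaining document's similarity to its closest exemplar), the trade-off weight `lam`, the function name `select_exemplars`, and the toy data are hypothetical and are not taken from the paper. The sketch also swaps the ILP solver for exhaustive enumeration of the same 0/1 choice, which yields the exact optimum but is only feasible for very small inputs.

```python
from itertools import combinations


def select_exemplars(relevance, similarity, k, lam=0.5):
    """Pick k exemplar documents by exhaustive search (exact, small inputs only).

    Hypothetical objective (an assumption, not the paper's formulation):
        lam * (sum of relevance scores of the exemplars)
        + (1 - lam) * (sum over non-exemplars of similarity to their
                       most similar exemplar)
    Returns the exemplars ordered by decreasing relevance, and the score.
    """
    n = len(relevance)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of documents")

    best_score, best_set = float("-inf"), None
    for exemplars in combinations(range(n), k):
        chosen = set(exemplars)
        rel = sum(relevance[j] for j in exemplars)
        # Each remaining document is assigned to its most similar exemplar.
        cover = sum(
            max(similarity[i][j] for j in exemplars)
            for i in range(n)
            if i not in chosen
        )
        score = lam * rel + (1 - lam) * cover
        if score > best_score:
            best_score, best_set = score, exemplars

    # Ranking step (also an assumption): order exemplars by relevance.
    ranked = sorted(best_set, key=lambda j: -relevance[j])
    return ranked, best_score


# Toy data: documents 0 and 1 are near-duplicates, as are documents 2 and 3.
toy_relevance = [0.9, 0.8, 0.4, 0.3]
toy_similarity = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.9, 1.0],
]
ranked, score = select_exemplars(toy_relevance, toy_similarity, k=2)
print(ranked, round(score, 2))
```

On this toy input the search picks one document from each near-duplicate pair instead of the two most relevant documents, which is the diversification effect the abstract describes. A realistic implementation would hand the same binary selection and assignment variables to an ILP solver rather than enumerate subsets.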
Year
2017
DOI
10.1145/3018661.3018710
Venue
WSDM
Keywords
Cluster-based IR, implicit SRD, integer linear programming
Field
Data mining, Ranking, Computer science, Integer linear programming formulation, Integer programming, Diversification (marketing strategy), Artificial intelligence, Document similarity, Machine learning
DocType
Conference
Citations
5
PageRank
0.40
References
36
Authors
7
Name              Order   Citations   PageRank
Haitao Yu         1       21          4.00
Adam Jatowt       2       903         106.73
Roi Blanco        3       872         57.42
Hideo Joho        4       881         70.47
Joemon M. Jose    5       2782        198.37
Long Chen         6       88          6.15
Fajie Yuan        7       143         14.55