Title
How to aggregate Top-lists: Approximation algorithms via scores and average ranks
Abstract
A top-list is a possibly incomplete ranking of elements: only a subset of the elements are ranked, with all unranked elements tied for last. Top-list aggregation, a generalization of the well-known rank aggregation problem, takes as input a collection of top-lists and aggregates them into a single complete ranking, aiming to minimize the number of upsets (pairs ranked in opposite order in the input and in the output). In this paper, we give simple approximation algorithms for top-list aggregation.
• We generalize the footrule algorithm for rank aggregation (which minimizes Spearman's footrule distance), yielding a simple 2-approximation algorithm for top-list aggregation.
• Ailon's RepeatChoice algorithm for bucket-order aggregation yields a 2-approximation algorithm for top-list aggregation. Taking inspiration from approval voting, we define the score of an element as the frequency with which it is ranked, i.e., with which it appears in an input top-list. We reinterpret RepeatChoice for top-list aggregation as a randomized algorithm using variables whose expectations correspond to the score of an element and to its average rank given that it is ranked.
• Using average ranks, we generalize and analyze Borda's algorithm for rank aggregation. We observe that the natural generalization is not a constant-factor approximation.
• We design a simple 2-phase variant of the generalized Borda's algorithm, roughly sorting by scores and breaking ties by average ranks, yielding another simple constant-factor approximation algorithm for top-list aggregation.
• We then design another 2-phase variant in which, to break ties, we use the Mathieu-Schudy PTAS for rank aggregation as a black box, yielding a PTAS for top-list aggregation. This solves an open problem posed by Ailon.
• Finally, in the special case in which all input lists have length at most k, we design another simple 2-phase algorithm based on sorting by scores, and prove that it is an EPTAS: the complexity is O(n log n) when k = o(log n).
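The two quantities the abstract relies on, the score of an element and its average rank given that it is ranked, can be illustrated with a short Python sketch. This is not the algorithm from the paper: the abstract says the 2-phase variant sorts only "roughly" by scores, whereas the sketch below sorts exactly by score and breaks ties by average rank. The function names, the `universe` parameter, the treatment of never-ranked elements, and the final tie-break on the element's string form are all assumptions made for illustration. The upset count follows one reading of the abstract's definition: a pair counts once for every input list that strictly prefers the opposite order, and two unranked elements are tied, so they never form an upset.

```python
from collections import defaultdict
from itertools import combinations


def scores_and_average_ranks(top_lists):
    """Return (score, avg_rank) dicts for every element ranked at least once."""
    count = defaultdict(int)     # number of lists in which the element appears
    rank_sum = defaultdict(int)  # sum of its 1-based positions in those lists
    for top_list in top_lists:
        for position, element in enumerate(top_list, start=1):
            count[element] += 1
            rank_sum[element] += position
    m = len(top_lists)
    score = {e: count[e] / m for e in count}               # frequency of being ranked
    avg_rank = {e: rank_sum[e] / count[e] for e in count}  # mean rank, given ranked
    return score, avg_rank


def aggregate_by_score_then_rank(top_lists, universe):
    """Illustrative simplification: exact sort by score, ties by average rank."""
    score, avg_rank = scores_and_average_ranks(top_lists)

    def key(e):
        # Assumption: never-ranked elements get score 0 and go last;
        # str(e) is only a deterministic final tie-break.
        return (-score.get(e, 0.0), avg_rank.get(e, float("inf")), str(e))

    return sorted(universe, key=key)


def count_upsets(top_lists, output):
    """Count (list, pair) combinations where the list strictly disagrees with output."""
    upsets = 0
    for top_list in top_lists:
        pos_in = {e: i for i, e in enumerate(top_list)}
        unranked = len(top_list)  # shared position of all unranked elements
        for a, b in combinations(output, 2):  # a precedes b in the output
            if pos_in.get(b, unranked) < pos_in.get(a, unranked):
                upsets += 1
    return upsets


# Hypothetical input: three top-lists over the elements a, b, c, d.
top_lists = [["a", "b"], ["b", "a", "c"], ["a"]]
universe = ["a", "b", "c", "d"]
ranking = aggregate_by_score_then_rank(top_lists, universe)
print(ranking, count_upsets(top_lists, ranking))
```

In this toy input, `a` appears in all three lists, `b` in two, `c` in one, and `d` in none, so the sort is decided by scores alone. Average ranks would only matter between elements with equal scores.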
Year
2020
DOI
10.5555/3381089.3381260
Venue
SODA '20: ACM-SIAM Symposium on Discrete Algorithms, Salt Lake City, Utah, January 2020
Field
Approximation algorithm, Discrete mathematics, Computer science
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
2
Name            Order  Citations  PageRank
Claire Mathieu  1      6          2.78
Simon Mauras    2      0          0.34