Abstract |
---|
The NP-hard Max-k-cover problem requires selecting k sets from a collection so as to maximize the size of the union. This classic problem occurs commonly in many settings in web search and advertising. For moderately-sized instances, a greedy algorithm gives an approximation of (1-1/e). However, the greedy algorithm requires updating scores of arbitrary elements after each step, and hence becomes intractable for large datasets. We give the first max cover algorithm designed for today's large-scale commodity clusters. Our algorithm has provably almost the same approximation as greedy, but runs much faster. Furthermore, it can be easily expressed in the MapReduce programming paradigm, and requires only polylogarithmically many passes over the data. Our experiments on five large problem instances show that our algorithm is practical and can achieve good speedups compared to the sequential greedy algorithm. |
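The greedy algorithm the abstract refers to repeatedly picks the set with the largest marginal coverage, i.e. the set that adds the most elements not yet covered, and is known to give a (1 - 1/e)-approximation. A minimal sketch in Python, assuming an illustrative instance format (a dict mapping set names to element sets; the function name and interface are not from the paper):

```python
def greedy_max_cover(sets, k):
    """Greedy Max-k-Cover: choose up to k sets, each time taking the
    one that covers the most elements not yet covered. This is the
    classic sequential (1 - 1/e)-approximation; it is exactly the
    per-step score updating that becomes the bottleneck at scale."""
    covered = set()       # union of all elements covered so far
    chosen = []           # names of the selected sets, in pick order
    remaining = dict(sets)
    for _ in range(min(k, len(remaining))):
        # Marginal coverage of each candidate must be recomputed every
        # step, since earlier picks shrink the gain of arbitrary sets.
        best = max(remaining, key=lambda name: len(remaining[name] - covered))
        if not (remaining[best] - covered):
            break  # no candidate adds new elements; stop early
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered


# Hypothetical toy instance: with k = 2, greedy first takes C (gain 4),
# then A (gain 3), covering all seven elements.
chosen, covered = greedy_max_cover(
    {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6, 7}}, k=2
)
```

The `max` over all remaining sets after every pick is the "updating scores of arbitrary elements" step the abstract identifies as intractable for large data, which motivates the paper's MapReduce formulation.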
Year | DOI | Venue |
---|---|---|
2010 | 10.1145/1772690.1772715 | WWW |
Keywords | Field | DocType |
---|---|---|
classic problem, arbitrary element, mapreduce programming paradigm, k set, large datasets, maximum cover, greedy algorithm, np-hard max-k-cover problem, large problem instance, sequential greedy algorithm, good speedup, map-reduce, programming paradigm, algorithm design | Data mining, Programming paradigm, Computer science, Greedy algorithm, Artificial intelligence, Greedy randomized adaptive search procedure, Machine learning, Best-first search | Conference |
Citations | PageRank | References |
---|---|---|
43 | 1.58 | 27 |
Authors |
---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Flavio Chierichetti | 1 | 626 | 39.42 |
Ravi Kumar | 2 | 13932 | 1642.48 |
Andrew Tomkins | 3 | 9388 | 1401.23 |