Title
Human-powered sorts and joins
Abstract
Crowdsourcing markets like Amazon's Mechanical Turk (MTurk) make it possible to task people with small jobs, such as labeling images or looking up phone numbers, via a programmatic interface. MTurk tasks for processing datasets with humans are currently designed with significant reimplementation of common workflows and ad-hoc selection of parameters such as price to pay per task. We describe how we have integrated crowds into a declarative workflow engine called Qurk to reduce the burden on workflow designers. In this paper, we focus on how to use humans to compare items for sorting and joining data, two of the most common operations in DBMSs. We describe our basic query interface and the user interface of the tasks we post to MTurk. We also propose a number of optimizations, including task batching, replacing pairwise comparisons with numerical ratings, and pre-filtering tables before joining them, which dramatically reduce the overall cost of running sorts and joins on the crowd. In an experiment joining two sets of images, we reduce the overall cost from $67 in a naive implementation to about $3, without substantially affecting accuracy or latency. In an end-to-end experiment, we reduced cost by a factor of 14.5.
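To make the cost argument concrete: a comparison-based crowd sort needs on the order of one task per pair of items, while the rating-based optimization needs only a few rating tasks per item. The following Python sketch is an illustration only, not Qurk's implementation; the price per task, batch sizes, and function names are assumptions made for the example.

    from statistics import mean

    # Hypothetical parameters (assumptions, not values from the paper):
    # price paid per HIT and how many comparisons/ratings fit in one batched HIT.
    PRICE_PER_HIT = 0.01
    COMPARISONS_PER_HIT = 5
    RATINGS_PER_HIT = 5

    def pairwise_sort_cost(n_items):
        """Cost of sorting by asking workers to compare every pair of items."""
        n_pairs = n_items * (n_items - 1) // 2
        n_hits = -(-n_pairs // COMPARISONS_PER_HIT)  # ceiling division
        return n_hits * PRICE_PER_HIT

    def rating_sort_cost(n_items, ratings_per_item=3):
        """Cost of sorting by collecting a few numeric ratings per item instead."""
        n_ratings = n_items * ratings_per_item
        n_hits = -(-n_ratings // RATINGS_PER_HIT)
        return n_hits * PRICE_PER_HIT

    def sort_by_mean_rating(ratings):
        """Order items by the average of their crowd-assigned ratings, highest first."""
        return sorted(ratings, key=lambda item: mean(ratings[item]), reverse=True)

    if __name__ == "__main__":
        n = 100
        print(f"pairwise comparisons: ${pairwise_sort_cost(n):.2f}")  # $9.90 for 100 items
        print(f"numeric ratings:      ${rating_sort_cost(n):.2f}")    # $0.60 for 100 items
        # Three items rated on a 1-7 scale by three workers each.
        print(sort_by_mean_rating({"a": [5, 6, 7], "b": [2, 3, 2], "c": [4, 4, 5]}))

For 100 items the pairwise approach needs 4,950 comparisons versus 300 ratings, which is the kind of gap the batching and rating optimizations described in the abstract exploit.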
Year
2011
DOI
10.14778/2047485.2047487
Venue
PVLDB
Keywords
task people, basic query interface, user interface, mturk task, task batching, overall cost, common workflows, programmatic interface, human-powered sort, declarative workflow engine, common operation
DocType
Journal
Volume
5
Issue
1
ISSN
Proceedings of the VLDB Endowment (PVLDB), Vol. 5, No. 1, pp. 13-24 (2011)
Citations
145
PageRank
5.09
References
8
Authors
5
Name              Order  Citations  PageRank
Adam Marcus       1      145        5.09
Eugene Wu         2      691        45.52
David R. Karger   3      19367      2233.64
Samuel Madden     4      16101      1176.38
Robert C. Miller  5      4412       326.00