How Many Workers to Ask? Adaptive Exploration for Collecting High Quality Labels - Citegraph

Paper Info

Title
How Many Workers to Ask? Adaptive Exploration for Collecting High Quality Labels

Abstract
Crowdsourcing has been part of the IR toolbox as a cheap and fast mechanism to obtain labels for system development and evaluation. Successful deployment of crowdsourcing at scale involves adjusting many variables, a very important one being the number of workers needed per human intelligence task (HIT). We consider the crowdsourcing task of learning the answer to simple multiple-choice HITs, which are representative of many relevance experiments. In order to provide statistically significant results, one often needs to ask multiple workers to answer the same HIT. A stopping rule is an algorithm that, given a HIT, decides for any given set of worker answers to stop and output an answer or iterate and ask one more worker. In contrast to other solutions that try to estimate worker performance and answer at the same time, our approach assumes the historical performance of a worker is known and tries to estimate the HIT difficulty and answer at the same time. The difficulty of the HIT decides how much weight to give to each worker's answer. In this paper we investigate how to devise better stopping rules given workers' performance quality scores. We suggest adaptive exploration as a promising approach for scalable and automatic creation of ground truth. We conduct a data analysis on an industrial crowdsourcing platform, and use the observations from this analysis to design new stopping rules that use the workers' quality scores in a non-trivial manner. We then perform a number of experiments using real-world datasets and simulated data, showing that our algorithm performs better than other approaches.
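The stopping-rule idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the weighted-vote margin, the threshold `c * sqrt(n)`, the `max_workers` cap, and every function name below are assumptions chosen for illustration. The sketch assumes each worker arrives with a known historical quality score in [0, 1], which is used as the weight of that worker's vote.

```python
import math
from collections import defaultdict


def weighted_leader(votes):
    """Return (leading answer, weighted margin over the runner-up).

    `votes` is a non-empty list of (answer, quality) pairs.
    """
    totals = defaultdict(float)
    for answer, quality in votes:
        totals[answer] += quality
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    top_answer, top_weight = ranked[0]
    runner_up_weight = ranked[1][1] if len(ranked) > 1 else 0.0
    return top_answer, top_weight - runner_up_weight


def should_stop(votes, c=1.5, max_workers=10):
    """Hypothetical stopping rule: stop once the weighted margin is large
    relative to sqrt(number of votes), or once the worker budget is spent."""
    if not votes:
        return False  # always ask at least one worker
    if len(votes) >= max_workers:
        return True
    _, margin = weighted_leader(votes)
    return margin > c * math.sqrt(len(votes))


def collect_label(ask_worker, c=1.5, max_workers=10):
    """Ask workers one at a time until the stopping rule fires.

    `ask_worker` is a hypothetical callable returning one (answer, quality)
    pair per call. Returns (chosen answer, number of workers asked).
    """
    votes = []
    while not should_stop(votes, c, max_workers):
        votes.append(ask_worker())
    answer, _ = weighted_leader(votes)
    return answer, len(votes)
```

With unanimous high-quality workers the rule stops early, while an evenly split HIT keeps asking until the budget runs out, which is the adaptive behavior the abstract describes: easy HITs consume few workers and hard HITs consume more.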

Year: 2016
DOI: 10.1145/2911451.2911514
Venue: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval
Keywords: Crowdsourcing, label quality, ground truth, assessments, adaptive algorithms, multi-armed bandits
Field: Data mining, Software deployment, Computer science, Crowdsourcing, Toolbox, Artificial intelligence, Ask price, Information retrieval, Human intelligence, Ground truth, Stopping rule, Machine learning, Scalability
DocType: Conference
ISBN: 978-1-4503-4069-4
Citations: 5
PageRank: 0.39
References: 15
Authors: 6

Authors (6 rows)

Name	Order	Citations	PageRank
Ittai Abraham	1	1483	89.62
Omar Alonso	2	855	65.44
Vasilis Kandylas	3	29	2.92
Rajesh Patel	4	6	1.07
Steven Shelford	5	6	0.74
Aleksandrs Slivkins	6	1133	75.56
