Title
Time-Sensitive Bayesian Information Aggregation for Crowdsourcing Systems
Abstract
Many aspects of the design of efficient crowdsourcing processes, such as setting workers' bonuses, fair prices and time limits for tasks, require knowledge of the likely duration of the task at hand. In this work, we introduce a new time-sensitive Bayesian aggregation method that simultaneously estimates a task's duration and obtains reliable aggregations of crowdsourced judgments. Our method, called BCCTime, uses latent variables to represent the uncertainty about the workers' completion time, the tasks' duration and the workers' accuracy. To relate the quality of a judgment to the time a worker spends on a task, our model assumes that each task has a latent time window within which all workers with a propensity to genuinely attempt the labelling task (i.e., non-spammers) are expected to submit their judgments. In contrast, workers with a lower propensity for valid labelling, such as spammers, bots or lazy labellers, are assumed to perform tasks considerably faster or slower than the time required by normal workers. Specifically, we use efficient message-passing Bayesian inference to learn approximate posterior probabilities of (i) the confusion matrix of each worker, (ii) the propensity for valid labelling of each worker, (iii) the unbiased duration of each task and (iv) the true label of each task. Using two real-world public datasets for entity linking tasks, we show that BCCTime produces up to 11% more accurate classifications and up to 100% more informative estimates of a task's duration compared to state-of-the-art methods.
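The time-window idea in the abstract can be illustrated with a much simplified sketch. This is not BCCTime itself: here the window and the confusion matrices are assumed to be known, and judgments outside the window are simply ignored, whereas the paper treats these quantities as latent variables learned by message-passing inference. The function name `aggregate_label` and the example data are hypothetical.

```python
def aggregate_label(judgments, window, confusion, num_classes):
    """Posterior over one task's true label from time-gated judgments.

    judgments: list of (worker_id, label, seconds) tuples for a single task.
    window: (shortest, longest) plausible duration; assumed known here,
        whereas the paper models it as a latent variable.
    confusion: worker_id -> matrix with confusion[w][t][l] = P(label l | true t).
    """
    shortest, longest = window
    scores = [1.0 / num_classes] * num_classes  # uniform prior over labels
    for worker, label, seconds in judgments:
        if not (shortest <= seconds <= longest):
            continue  # too fast or too slow: treated as uninformative
        for true_label in range(num_classes):
            scores[true_label] *= confusion[worker][true_label][label]
    total = sum(scores)
    if total == 0:
        return [1.0 / num_classes] * num_classes  # no usable evidence
    return [s / total for s in scores]


# Hypothetical data: two careful workers and one one-second submission.
reliable = [[0.9, 0.1], [0.1, 0.9]]
confusion = {"a": reliable, "b": reliable, "s": reliable}
judgments = [("a", 1, 30.0), ("b", 1, 45.0), ("s", 0, 1.0)]
posterior = aggregate_label(judgments, (10.0, 120.0), confusion, 2)
```

With these numbers the one-second judgment falls outside the window and does not affect the posterior, so the two in-window votes for label 1 dominate. A hard cut-off like this is the crudest possible choice; the abstract describes a probabilistic treatment in which each worker's propensity for valid labelling is also estimated.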
Year: 2015
DOI: 10.1613/jair.5175
Venue: J. Artif. Intell. Res. (JAIR)
Field: Entity linking, Bayesian inference, Confusion matrix, Crowdsourcing, Computer science, Posterior probability, Latent variable, Artificial intelligence, Information aggregation, Machine learning, Bayesian probability
DocType: Journal
Volume: abs/1510.06335
Issue: 1
ISSN: 1076-9757
Citations: 6
PageRank: 0.52
References: 27
Authors: 4
Name                  Order  Citations  PageRank
Matteo Venanzi        1      251        16.27
John Guiver           2      482        21.48
Pushmeet Kohli        3      7398       332.84
Nicholas R. Jennings  4      19348      1564.35