Title
Detecting Linked Data quality issues via crowdsourcing: A DBpedia study.
Abstract
In this paper we examine the use of crowdsourcing as a means to detect Linked Data quality problems that are difficult to uncover automatically. Our approach is based on an analysis of the most common errors encountered in the DBpedia dataset and a classification of these errors according to the extent to which they are likely to be amenable to crowdsourcing. We then propose and study different crowdsourcing approaches to identify these Linked Data quality issues, employing DBpedia as our use case: (i) a contest targeting the Linked Data expert community, and (ii) paid microtasks published on Amazon Mechanical Turk. We further adapt the Find-Fix-Verify crowdsourcing pattern to exploit the strengths of experts and lay workers. By testing two distinct Find-Verify workflows (lay users only, and experts verified by lay users), we show how best to combine the complementary aptitudes of different crowds in detecting Linked Data quality issues. Empirical results show that a combination of the two styles of crowdsourcing is likely to achieve more effective results than either used in isolation, and that human computation is a promising and affordable way to enhance the quality of DBpedia.
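To make the Find-Verify idea from the abstract concrete, the following minimal sketch (not taken from the paper; all names, triples, and judgements are illustrative) shows how candidate quality issues flagged in a Find stage could be confirmed by majority vote in a Verify stage.

```python
from collections import Counter

# Illustrative sketch of a Find-Verify workflow (hypothetical data and names):
# the Find stage yields candidate erroneous triples; the Verify stage
# aggregates yes/no judgements from several workers by majority vote.

def verify(candidates, judgements, min_votes=3):
    """Keep only candidates that a majority of at least `min_votes` workers confirmed as erroneous."""
    confirmed = []
    for triple in candidates:
        votes = judgements.get(triple, [])
        if len(votes) >= min_votes and Counter(votes)["error"] > len(votes) / 2:
            confirmed.append(triple)
    return confirmed

# Example: one candidate triple flagged in the Find stage (e.g. by LD experts),
# then judged by three lay workers in the Verify stage.
candidates = [("dbr:Berlin", "dbo:populationTotal", '"-3460725"')]
judgements = {candidates[0]: ["error", "error", "correct"]}
print(verify(candidates, judgements))  # the triple is confirmed as erroneous
```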
Year
2018
DOI
10.3233/SW-160239
Venue
Semantic Web
Keywords
Quality assessment, quality issues, Linked Data, crowdsourcing, microtasks, experts
Field
Data science, Crowds, Crowdsourcing, Contest, Linked data, Exploit, Philosophy, Human computation, Workflow, Linguistics
DocType
Journal
Volume
9
Issue
3
ISSN
1570-0844
Citations
4
PageRank
0.41
References
43
Authors
6
Name | Order | Citations | PageRank
Maribel Acosta | 1 | 189 | 20.81
Amrapali Zaveri | 2 | 368 | 24.37
Elena Simperl | 3 | 1069 | 122.60
Dimitris Kontokostas | 4 | 490 | 31.79
Fabian Flöck | 5 | 44 | 6.64
Jens Lehmann | 6 | 5375 | 355.08