Abstract | ||
---|---|---|
In this paper, we explore how to efficiently combine crowdsourcing and machine intelligence for document screening, where documents must be screened with a set of machine-learning filters. Specifically, we focus on building a set of machine-learning classifiers that evaluate documents, and then on screening the documents efficiently. The task is challenging because the budget is limited and there are countless ways to spend it. We propose objective-aware sampling, a screening-specific sampling technique for multi-label active learning that selects unlabeled documents for annotation. Our algorithm decides which machine filters need more training data and how to choose unlabeled items to annotate so as to minimize the risk of overall classification errors rather than the error of a single filter. We demonstrate that objective-aware sampling significantly outperforms state-of-the-art active learning sampling strategies. |
Year | Venue | DocType |
---|---|---|
2020 | CSW@NeurIPS | Conference |
Citations | PageRank | References |
0 | 0.34 | 0 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Evgeny Krivosheev | 1 | 7 | 4.38 |
Burcu Sayin | 2 | 0 | 0.34 |
Alessandro Bozzon | 3 | 641 | 71.27 |
Zoltán Szlávik | 4 | 116 | 21.40 |