Detecting Undisclosed Paid Editing in Wikipedia - Citegraph

Paper Info

Title
Detecting Undisclosed Paid Editing in Wikipedia

Abstract
Wikipedia, the free and open-collaboration based online encyclopedia, has millions of pages that are maintained by thousands of volunteer editors. As per Wikipedia’s fundamental principles, pages on Wikipedia are written with a neutral point of view and maintained by volunteer editors for free with well-defined guidelines in order to avoid or disclose any conflict of interest. However, there have been several known incidents where editors intentionally violate such guidelines in order to get paid (or even extort money) for maintaining promotional spam articles without disclosing such. In this paper, we address for the first time the problem of identifying undisclosed paid articles in Wikipedia. We propose a machine learning-based framework using a set of features based on both the content of the articles as well as the patterns of edit history of users who create them. To test our approach, we collected and curated a new dataset from English Wikipedia with ground truth on undisclosed paid articles. Our experimental evaluation shows that we can identify undisclosed paid articles with an AUROC of 0.98 and an average precision of 0.91. Moreover, our approach outperforms ORES, a scoring system tool currently used by Wikipedia to automatically detect damaging content, in identifying undisclosed paid articles. Finally, we show that our user-based features can also detect undisclosed paid editors with an AUROC of 0.94 and an average precision of 0.92, outperforming existing approaches.
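
The abstract reports its results as AUROC and average precision. As a minimal sketch of how these two ranking metrics can be computed, the Python below uses hypothetical labels and scores (toy values, not data from the paper). The helper names `auroc` and `average_precision` are illustrative assumptions and are not the authors' code.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values measured at the rank of each positive.

    Simplification: tied scores keep their input order instead of being
    grouped into a single threshold.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive example")
    return precision_sum / hits


# Hypothetical toy data: 1 = undisclosed paid article, 0 = regular article.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]

print(f"AUROC: {auroc(toy_labels, toy_scores):.3f}")
print(f"Average precision: {average_precision(toy_labels, toy_scores):.3f}")
```

AUROC depends only on how positives rank relative to negatives, while average precision weights the top of the ranking more heavily, which is why the two are often reported together when the positive class is rare.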

Year	DOI	Venue
2020	10.1145/3366423.3380055	WWW '20: The Web Conference 2020, Taipei, Taiwan, April 2020
Keywords	DocType	ISBN
Wikipedia, Detection of abusive content, Malicious editors, Sock-puppet accounts	Conference	978-1-4503-7023-3
Citations	PageRank	References
0	0.34	0
Authors
4

Authors (4 rows)

Name	Order	Citations	PageRank
Nikesh Joshi	1	0	0.34
Francesca Spezzano	2	80	19.08
Mayson Green	3	0	0.34
Elijah Hill	4	0	0.34
