Title
Learning a Privacy Incidents Database.
Abstract
A repository of privacy incidents is essential for understanding the attributes of products and policies that lead to privacy incidents. We describe our vision for a novel privacy incidents database and our progress toward building a prototype. Key challenges in gathering such a database include bootstrapping and sustainability. We propose a semi-automated framework that can recognize privacy incidents and related information from various online sources such as news, blogs, and social media. The crux of our framework is an incident classifier that identifies whether a piece of text in natural language is related to a privacy incident or not. We curate a dataset consisting of 1324 news articles of which 543 articles are about one or more privacy incidents. We train the incident classifier on this dataset, considering a variety of feature engineering, feature selection, and classification techniques. We find that our incident classifier yields an F1 measure of 93.1%, which is about 12% higher than the keyword search-based baselines we adopt.
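As a rough illustration of the evaluation the abstract describes, the sketch below scores a keyword-search baseline with the F1 measure. It is a minimal sketch, not the authors' implementation: the keyword list, the helper names `keyword_baseline` and `f1_score`, and the four toy articles are all assumptions made up for this example. Only the counts 543 and 1324 come from the abstract.

```python
from typing import Iterable, List

# Hypothetical keyword list; the abstract does not name the baseline's keywords.
KEYWORDS = ("privacy", "data breach", "leak")


def keyword_baseline(text: str, keywords: Iterable[str] = KEYWORDS) -> bool:
    """Flag a text as incident-related if it contains any keyword."""
    lowered = text.lower()
    return any(k in lowered for k in keywords)


def f1_score(y_true: List[bool], y_pred: List[bool]) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy examples invented for illustration only (not from the paper's dataset).
articles = [
    ("Retailer confirms data breach exposing customer emails", True),
    ("App shared user locations with advertisers without consent", True),
    ("New privacy law debated in parliament", False),
    ("Local team wins championship", False),
]
texts, labels = zip(*articles)
predictions = [keyword_baseline(t) for t in texts]
baseline_f1 = f1_score(list(labels), predictions)

# Class balance reported in the abstract: 543 of 1324 articles are positive.
positive_fraction = 543 / 1324
```

The toy data shows the two ways a keyword baseline can fail: it misses an incident that uses none of the keywords, and it flags a non-incident article that merely mentions privacy. A trained classifier is meant to reduce both kinds of error.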
Year
2017
DOI
10.1145/3055305.3055309
Venue
HotSoS
DocType
Conference
Citations
1
PageRank
0.36
References
1
Authors
5
Name | Order | Citations | PageRank
Pradeep K. Murukannaiah | 1 | 1015.89
Chinmaya Dabral | 2 | 1 | 0.70
Karthik Sheshadri | 3 | 1 | 1.04
Esha Sharma | 4 | 1 | 0.36
Jessica Staddon | 5 | 1762128.75