Title
Review: Privacy-Preservation in the Context of Natural Language Processing
Abstract
Data privacy has been one of the most widely discussed issues in recent years, as data breaches and privacy scandals occur frequently. This raises many concerns about how data is acquired and how information may leak. In the field of Artificial Intelligence (AI) in particular, the widespread use of AI models aggravates the vulnerability of user privacy, because a considerable portion of the user data that AI models use is represented in natural language. In the past few years, many researchers have proposed NLP-based methods to address these data privacy challenges. To the best of our knowledge, this is the first interdisciplinary review discussing privacy preservation in the context of Natural Language Processing (NLP). In this paper, we present a comprehensive review of previous research, gathering the techniques and challenges of building and testing privacy-preserving systems in the context of NLP. We group the different works under four categories: 1) data privacy in the medical domain, 2) privacy preservation in the technology domain, 3) analysis of privacy policies, and 4) privacy leak detection in text representations. This review compares the contributions and pitfalls of the various works on privacy violation detection and prevention that use NLP techniques, to help guide a path ahead.
Year
2021
DOI
10.1109/ACCESS.2021.3124163
Venue
IEEE Access
Keywords
Data privacy, Privacy, Natural language processing, Feature extraction, Task analysis, Data models, Social networking (online), privacy preservation, privacy policy
DocType
Journal
Volume
9
ISSN
2169-3536
Citations
0
PageRank
0.34
References
0
Authors
3
Name | Order | Citations | PageRank
Darshini Mahendran | 1 | 0 | 0.34
Changqing Luo | 2 | 0 | 0.34
Bridget T. McInnes | 3 | 280 | 23.66