Title
Utility-preserving sanitization of semantically correlated terms in textual documents.
Abstract
• A general framework for measuring disclosure risk in document sanitization/redaction is presented.
• Notions from information theory are exploited to detect and sanitize risky correlated terms.
• Special effort is put into preserving the utility of the sanitized output.
• A practical implementation of the theoretical method is detailed.
• The evaluation shows improved data utility compared with previous work while retaining similar accuracy.
Year: 2014
DOI: 10.1016/j.ins.2014.03.103
Venue: Information Sciences
Keywords: Data privacy, Document redaction, Document sanitization, Information theory
Field: Information theory, Redaction, Information retrieval, Computer science, Generalization, Declassification, Information privacy, Semantics
DocType: Journal
Volume: 279
ISSN: 0020-0255
Citations: 11
PageRank: 0.58
References: 27
Authors: 3
Name              Order  Citations  PageRank
David Sánchez     1      395        32.93
Montserrat Batet  2      899        37.20
Alexandre Viejo   3      352        25.61