Abstract | ||
---|---|---|
One approach to analysis of private data is ε-differential privacy, a randomization-based approach that protects individual data items by injecting carefully limited noise into results. A challenge in applying this to private data analysis is that the noise added to the feature parameters is directly proportional to the number of parameters learned. While careful feature selection would alleviate this problem, the process of feature selection itself can reveal private information, requiring the application of differential privacy to the feature selection process. In this paper, we analyze the sensitivity of various feature selection techniques used in data mining and show that some of them are not suitable for differentially private analysis due to high sensitivity. We give experimental results showing the value of using low sensitivity feature selection techniques. We also show that the same concepts can be used to improve differentially private decision trees.
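The abstract's point that noise grows with the number of parameters learned can be illustrated with a minimal sketch. It assumes the standard Laplace mechanism and an even split of the total budget ε across parameters under sequential composition; the abstract does not say which mechanism or budget allocation the paper uses. The function names, the default sensitivity of 1.0, and the example values are all hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def per_parameter_scale(num_params, epsilon, sensitivity=1.0):
    """Laplace scale per parameter when epsilon is split evenly over num_params.

    Assumption: each parameter is released with budget epsilon / num_params,
    so the scale sensitivity / (epsilon / num_params) grows linearly with
    the number of parameters.
    """
    if num_params < 1 or epsilon <= 0:
        raise ValueError("num_params must be >= 1 and epsilon must be > 0")
    return sensitivity * num_params / epsilon


def noisy_parameters(true_values, epsilon, sensitivity=1.0, seed=None):
    """Add independent Laplace noise to each value under a shared budget."""
    rng = random.Random(seed)
    scale = per_parameter_scale(len(true_values), epsilon, sensitivity)
    return [value + laplace_noise(scale, rng) for value in true_values]


if __name__ == "__main__":
    # Hypothetical comparison: all features versus a selected subset.
    for k in (50, 5):
        print(k, "parameters -> Laplace scale", per_parameter_scale(k, epsilon=1.0))
```

Under these assumptions, keeping 5 features instead of 50 cuts the per-parameter noise scale by a factor of ten. The sketch leaves out the privacy cost of the selection step itself, which the abstract identifies as the problem the paper analyzes.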
Year | DOI | Venue |
---|---|---|
2018 | 10.1145/3180445.3180452 | IWSPA@CODASPY |
Keywords | Field | DocType |
Differential privacy, sensitivity, data mining, classification, decision trees, naive bayes, feature selection, privacy preserving data mining | Decision tree, Data mining, Differential privacy, Feature selection, Naive Bayes classifier, Computer science, Private information retrieval | Conference |
ISBN | Citations | PageRank |
978-1-4503-5634-3 | 0 | 0.34 |
References | Authors | |
10 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Balamurugan Anandan | 1 | 28 | 2.35 |
Chris Clifton | 2 | 3327 | 544.44 |