Title
Collaborative PCA/DCA Learning Methods for Compressive Privacy.
Abstract
In the Internet era, the data being collected on consumers like us are growing exponentially, and attacks on our privacy are becoming a real threat. To better ensure our privacy, it is safer to let the data owner control the data to be uploaded to the network as opposed to taking chances with data servers or third parties. To this end, we propose compressive privacy, a privacy-preserving technique that enables the data creator to compress data via collaborative learning so that the compressed data uploaded onto the Internet will be useful only for the intended utility and not be easily diverted to malicious applications. For data in a high-dimensional feature vector space, a common approach to data compression is dimension reduction or, equivalently, subspace projection. The most prominent tool is principal component analysis (PCA). For unsupervised learning, PCA can best recover the original data given a specific reduced dimensionality. However, in a supervised learning environment, it is more effective to adopt a supervised PCA, known as discriminant component analysis (DCA), to maximize the discriminant capability. The DCA subspace analysis embraces two different subspaces. The signal-subspace components of DCA are associated with the discriminant distance/power (related to the classification effectiveness), whereas the noise-subspace components of DCA are tightly coupled with recoverability and/or privacy protection. This article presents three DCA-related data compression methods useful for privacy-preserving applications:
- Utility-driven DCA: Because the rank of the signal subspace is limited by the number of classes, DCA can effectively support classification using a relatively small dimensionality (i.e., high compression).
- Desensitized PCA: Incorporating a signal-subspace ridge into DCA yields a variant that is especially effective for extracting privacy-preserving components. In this case, the eigenvalues of the noise subspace are made insensitive to the privacy labels and are ordered according to their corresponding component powers.
- Desensitized K-means/SOM: Since revealing the K-means or SOM cluster structure could leak sensitive information, it is safer to perform K-means or SOM clustering on a desensitized PCA subspace.
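The claim that the signal subspace has rank limited by the number of classes can be checked numerically. The sketch below is an illustration only: it uses a standard Fisher-style formulation (between-class scatter against ridge-regularized within-class scatter), which may differ from the article's exact DCA definition. The function names, the ridge parameter `rho`, and the synthetic data are all assumptions introduced here.

```python
import numpy as np

def scatter_matrices(X, y):
    """Return (S_B, S_W): between-class and within-class scatter of the rows of X."""
    mean = X.mean(axis=0)
    d = X.shape[1]
    S_B = np.zeros((d, d))
    S_W = np.zeros((d, d))
    for label in np.unique(y):
        Xc = X[y == label]
        mu = Xc.mean(axis=0)
        diff = (mu - mean)[:, None]
        S_B += len(Xc) * (diff @ diff.T)
        centered = Xc - mu
        S_W += centered.T @ centered
    return S_B, S_W

def discriminant_components(X, y, rho=1e-3):
    """Solve S_B w = lambda (S_W + rho*I) w; rho is a hypothetical ridge value."""
    S_B, S_W = scatter_matrices(X, y)
    d = X.shape[1]
    # Cholesky whitening turns the generalized problem into a symmetric one.
    L = np.linalg.cholesky(S_W + rho * np.eye(d))
    L_inv = np.linalg.inv(L)
    evals, evecs = np.linalg.eigh(L_inv @ S_B @ L_inv.T)
    order = np.argsort(evals)[::-1]          # largest discriminant power first
    return evals[order], L_inv.T @ evecs[:, order]

# Synthetic data (assumption): 3 classes, 10 features, 60 samples per class.
rng = np.random.default_rng(0)
n_classes, n_features, n_per_class = 3, 10, 60
means = rng.normal(scale=3.0, size=(n_classes, n_features))
X = np.vstack([m + rng.normal(size=(n_per_class, n_features)) for m in means])
y = np.repeat(np.arange(n_classes), n_per_class)

evals, W = discriminant_components(X, y)
# Count eigenvalues that are non-negligible relative to the largest one.
n_signal = int(np.sum(evals > 1e-8 * evals[0]))
Z = X @ W[:, :n_classes - 1]                 # compressed representation
print(n_signal, Z.shape)
```

Here `n_signal` is at most `n_classes - 1`, so the 10-dimensional vectors are compressed to 2 dimensions for classification; the remaining components carry essentially no discriminant power for these labels.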
Year: 2017
DOI: 10.1145/2996460
Venue: ACM Trans. Embedded Comput. Syst.
Keywords: DCA, PCA, compressive privacy, K-means, face-recognition, KDCA
Field: Data mining, k-means clustering, Dimensionality reduction, Pattern recognition, Subspace topology, Computer science, Supervised learning, Unsupervised learning, Artificial intelligence, Cluster analysis, Signal subspace, Principal component analysis
DocType: Journal
Volume: 16
Issue: 3
ISSN: 1539-9087
Citations: 3
PageRank: 0.69
References: 3
Authors: 4
Name | Order | Citations | PageRank
Sun-Yuan Kung | 1 | 1853 | 256.39
Thee Chanyaswad | 2 | 8 | 2.63
J. Morris Chang | 3 | 101 | 13.18
Pei Yuan Wu | 4 | 16 | 3.96