Title
Threatening Privacy across Social Graphs: A Structural Features Approach
Abstract
Can privacy of individuals in an anonymized social graph be threatened by a handful of structural features i.e., Without knowledge of link-level connectivity? The answer is yes. We define threatening privacy as being able to narrow down each anonymized individual's identity to a small set of known individuals that is most likely to include the anonymized individual. We call this set the Identity+ or I+ set. In this manner, some anonymized individuals get associated with smaller I+ sets and others to larger I+ sets. This distinction is how their privacy is threatened. To find I+ sets, we utilize (1) the structural features released in the anonymized graph (such as a node's degree, clustering coefficient, and average degree of neighbors) and (2) information in an auxiliary (publicly available) graph. To our knowledge, our work is the first of its kind that only uses structural features of graphs for deanonymization. Previous works assume the adjacency matrix of the anonymized graph is released and hence rely on the sparsity of the human behavior exhibited in the adjacency matrix. We will show that structural features are not sparse (i.e., Have few zero entries). In fact, there are many lookalikes when one only inspects structural features. Our proposed approach, RRID+, recursively clusters and matches structural features of the anonymized and auxiliary graphs in a top-down greedy manner. Averaged over a range of real graphs, RRID+ is able to find the correct cluster in the auxiliary data for 37% of the individuals (i.e., Recall = 37%) and narrow down their identity to 26% of the population in the auxiliary data (i.e., Precision on recalled nodes = 74%). Our experiment -- involving multiple synthetic and real graphs plus different noise models (between the anonymized and auxiliary graphs -- showcase how various parameters can affect precision and recall.
Year
DOI
Venue
2014
10.1109/ICDMW.2014.151
Data Mining Workshop
Keywords
DocType
Citations 
data privacy,graph theory,graphs,matrix algebra,adjacency matrix,anonymized graph,anonymized individual,anonymized social graph,auxiliary data,auxiliary graphs,deanonymization,human behavior,link-level connectivity,social graphs,structural features,threatening privacy,top-down greedy manner,Social graphs,deanonymization,graph mining,re-identification
Conference
0
PageRank 
References 
Authors
0.34
13
6
Name
Order
Citations
PageRank
Priya Govindan1373.30
Tina Eliassi-Rad21597108.63
Jin Xu36210.08
Shawndra Hill400.34
Chris Volinsky54325194.44
Eliassi-Rad, T.600.34