Title
Privacy preserving regression modelling via distributed computation
Abstract
Reluctance of data owners to share their possibly confidential or proprietary data with others who own related databases is a serious impediment to conducting a mutually beneficial data mining analysis. We address the case of vertically partitioned data -- multiple data owners/agencies each possess a few attributes of every data record. We focus on the case in which the agencies want to conduct a linear regression analysis with complete records without disclosing the values of their own attributes. This paper describes an algorithm that enables such agencies to compute the exact regression coefficients of the global regression equation and also perform some basic goodness-of-fit diagnostics while protecting the confidentiality of their data. In more general settings beyond the privacy scenario, this algorithm can also be viewed as a method for distributed computation of regression analyses.
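To make the vertically partitioned setting concrete, here is a minimal sketch in which each agency holds some columns of the design matrix and only ever updates the coefficients of its own columns. It is not the paper's algorithm, and it is not privacy preserving: the residual vector is passed around in the clear, where a real protocol would need a protected exchange. The function name fit_partitioned, the agency names, the toy data, and the assumption that every party can see the response y are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: cyclic coordinate descent for least squares on
# vertically partitioned data. NOT the paper's protocol and NOT private,
# because the residual vector is shared in the clear between agencies.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fit_partitioned(columns_by_agency, y, sweeps=200):
    """columns_by_agency: {agency: [column, ...]}, each column a list of n values.
    Returns ({agency: [coefficient, ...]}, residual) for the least-squares fit."""
    coefs = {a: [0.0] * len(cols) for a, cols in columns_by_agency.items()}
    residual = list(y)  # y minus fitted values; all coefficients start at zero
    for _ in range(sweeps):
        for agency, cols in columns_by_agency.items():
            for k, x in enumerate(cols):
                # Exact minimiser over this one coefficient, others held fixed.
                step = dot(x, residual) / dot(x, x)
                coefs[agency][k] += step
                residual = [r - step * xi for r, xi in zip(residual, x)]
    return coefs, residual

# Hypothetical toy data: 5 records, built so that y = 1 + 2*x1 + 3*x2 exactly.
columns_by_agency = {
    "agency_A": [[1, 1, 1, 1, 1], [-2, -1, 0, 1, 2]],  # intercept column and x1
    "agency_B": [[-1, -2, 1, 0, 2]],                   # x2
}
y = [-6, -7, 4, 3, 11]

coefs, residual = fit_partitioned(columns_by_agency, y)
rss = dot(residual, residual)  # residual sum of squares, a basic fit diagnostic
print(coefs, rss)
```

Because the toy response was constructed from known coefficients, the sweeps should recover an intercept of 1 and slopes of 2 and 3, with a residual sum of squares near zero. On real data the number of sweeps needed depends on how strongly the agencies' attributes are correlated.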
Year
2004
DOI
10.1145/1014052.1014139
Venue
KDD
Keywords
multiple data owner,exact regression coefficient,partitioned data,beneficial data mining analysis,global regression equation,data owner,linear regression analysis,proprietary data,regression modelling,regression analysis,data record,regression,secure multi party computation,data mining,regression equation,relational database,distributed computing,data integration,data confidentiality,goodness of fit,data integrity,linear regression,computation
Field
Data integration,Data mining,Multiple data,Secure multi-party computation,Regression,Confidentiality,Regression analysis,Computer science,Artificial intelligence,Machine learning,Linear regression,Computation
DocType
Conference
ISBN
1-58113-888-1
Citations
46
PageRank
3.00
References
9
Authors
4
Name              Order  Citations  PageRank
Ashish Sanil      1      152        12.81
Alan F. Karr      2      1005       76.93
Xiaodong Lin      3      80         5.34
Jerome P. Reiter  4      216        22.12