Title
Logistic Regression on Homomorphic Encrypted Data at Scale
Abstract
Machine learning on (homomorphic) encrypted data is a cryptographic method for analyzing private and/or sensitive data while preserving privacy. In the training phase, it takes encrypted training data as input and outputs an encrypted model without ever decrypting. In the prediction phase, it uses the encrypted model to predict results on new encrypted data. Neither phase requires a decryption key, so data privacy is guaranteed throughout. It has many applications in areas that handle sensitive private data, such as finance, education, genomics, and medicine. While several studies have addressed the prediction phase, few have addressed the training phase. In this paper, we present an efficient algorithm for logistic regression on homomorphic encrypted data and evaluate it on real financial data consisting of 422,108 samples over 200 features. Our experiment shows that an encrypted model with a sufficient Kolmogorov-Smirnov statistic value can be obtained in about 17 hours on a single machine. We also evaluate our algorithm on the public MNIST dataset, where it takes about 2 hours to learn an encrypted model with 96.4% accuracy. Considering the inefficiency of homomorphic encryption, our result is encouraging and demonstrates, for the first time to the best of our knowledge, the practical feasibility of logistic regression training on large encrypted data.
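The abstract reports model quality on the financial data as a Kolmogorov-Smirnov (KS) statistic. As a minimal sketch of what that metric measures, the snippet below computes the two-sample KS statistic between the scores a binary classifier assigns to positive and negative samples. It assumes the scores are available in plaintext for evaluation (for example, after the key holder decrypts them); the function name `ks_statistic` and the toy data are illustrative and do not come from the paper.

```python
from bisect import bisect_right


def ks_statistic(scores, labels):
    """Two-sample KS statistic between the scores of positives and negatives."""
    pos = sorted(s for s, y in zip(scores, labels) if y == 1)
    neg = sorted(s for s, y in zip(scores, labels) if y == 0)
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    best = 0.0
    # Both empirical CDFs are step functions, so the largest gap between
    # them is always attained at one of the observed score values.
    for t in set(pos) | set(neg):
        cdf_pos = bisect_right(pos, t) / len(pos)
        cdf_neg = bisect_right(neg, t) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


# Toy data, made up for illustration (not from the paper).
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
ks = ks_statistic(scores, labels)
```

A value near 1 means the two score distributions are almost fully separated, while a value near 0 means the model barely distinguishes the classes.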
Year
2019
DOI
10.1609/aaai.v33i01.33019466
Venue
AAAI
Field
Homomorphic encryption, MNIST database, Statistic, Cryptography, Computer science, Inefficiency, Encryption, Artificial intelligence, Information privacy, Logistic regression, Machine learning
DocType
Conference
Volume
33
Citations
4
PageRank
0.42
References
0
Authors
4
Name             Order  Citations  PageRank
Kyoohyung Han    1      57         7.43
Seungwan Hong    2      12         4.70
Jung Hee Cheon   3      1787       129.74
Daejun Park      4      74         6.20