Title
PrIU: A Provenance-Based Approach for Incrementally Updating Regression Models
Abstract
The ubiquitous use of machine learning algorithms brings new challenges to traditional database problems such as incremental view update. Much effort is being put into better understanding and debugging machine learning models, as well as into identifying and repairing errors in training datasets. Our focus is on how to assist these activities when the model must be retrained after removing problematic training samples during data cleaning or after selecting different subsets of training data for interpretability. This paper presents an efficient provenance-based approach, PrIU, and its optimized version, PrIU-opt, for incrementally updating model parameters without sacrificing prediction accuracy. We prove the correctness and convergence of the incrementally updated model parameters and validate them experimentally. Experimental results show that PrIU-opt achieves speed-ups of up to two orders of magnitude compared to retraining the model from scratch, while obtaining highly similar models.
Year
2020
DOI
10.1145/3318464.3380571
Venue
SIGMOD/PODS '20: International Conference on Management of Data, Portland, OR, USA, June 2020
Keywords
Data provenance, machine learning, deletion propagation
DocType
Conference
ISBN
978-1-4503-6735-6
Citations
0
PageRank
0.34
References
27
Authors
3
Name              Order   Citations   PageRank
Yinjun Wu         1       11          4.39
Val Tannen        2       2367        518.95
Susan Davidson    3       3076        1016.56