Abstract
---
Black-box risk scoring models permeate our lives, yet are typically proprietary and opaque. We propose a transparent model distillation approach to detect bias in such models. Model distillation was originally designed to distill knowledge from a large, complex teacher model to a faster, simpler student model without significant loss in prediction accuracy. We add a third restriction - transparency. In this paper we use data sets that contain two labels to train on: the risk score predicted by a black-box model, as well as the actual outcome the risk score was intended to predict. This allows us to compare models trained to predict each label. For a particular class of student models - interpretable generalized additive models with pairwise interactions (GA2Ms) - we provide confidence intervals for the difference between the risk score and actual outcome models. This presents a new method for detecting bias in black-box risk scores: assessing whether the contributions of protected features to the risk score are statistically different from their contributions to the actual outcome.
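The comparison the abstract describes can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: scikit-learn's `GradientBoostingRegressor` stands in for the GA2M student models, a bootstrap percentile interval stands in for the paper's confidence intervals, and the array names (`X`, `risk_score`, `outcome`), the protected column index, and its two group values are all assumptions.

```python
# Sketch of the distill-and-compare idea: train one student on the
# black-box risk score (the mimic) and one on the true outcome, then
# bootstrap the difference in a protected feature's contribution.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor


def feature_contribution(model, X, col, lo, hi):
    """Marginal effect of moving column `col` from value `lo` to `hi`,
    averaged over the data (a partial-dependence-style contribution)."""
    X_lo, X_hi = X.copy(), X.copy()
    X_lo[:, col], X_hi[:, col] = lo, hi
    return model.predict(X_hi).mean() - model.predict(X_lo).mean()


def audit(X, risk_score, outcome, col, lo=0.0, hi=1.0, n_boot=100, seed=0):
    """Bootstrap CI for the gap between the protected feature's
    contribution to the risk score vs. to the actual outcome."""
    rng = np.random.default_rng(seed)
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(X), len(X))  # bootstrap resample
        mimic = GradientBoostingRegressor().fit(X[idx], risk_score[idx])
        actual = GradientBoostingRegressor().fit(X[idx], outcome[idx])
        diffs.append(
            feature_contribution(mimic, X[idx], col, lo, hi)
            - feature_contribution(actual, X[idx], col, lo, hi)
        )
    return np.percentile(diffs, [2.5, 97.5])  # 95% interval for the gap
```

If the resulting interval excludes zero, the protected feature contributes differently to the black-box score than to the real outcome, which is the bias signal the paper proposes to detect.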
Year | Venue | Field
---|---|---|
2017 | arXiv: Machine Learning | Framingham Risk Score, Black box, Pairwise comparison, Data mining, Data set, Additive model, Distillation, Artificial intelligence, Confidence interval, Mathematics, Machine learning
DocType | Volume | Citations
---|---|---|
Journal | abs/1710.06169 | 4
PageRank | References | Authors
---|---|---|
0.43 | 21 | 4
Name | Order | Citations | PageRank |
---|---|---|---|
Sarah Tan | 1 | 9 | 3.68 |
Rich Caruana | 2 | 4503 | 655.71 |
Giles Hooker | 3 | 59 | 13.40 |
Yin Lou | 4 | 506 | 28.82 |