Abstract
---
Black-box risk scoring models permeate our lives, yet are typically proprietary and opaque. We propose a transparent model distillation approach to detect bias in such models. Model distillation was originally designed to distill knowledge from a large, complex teacher model to a faster, simpler student model without significant loss in prediction accuracy. We add a third restriction - transparency. In this paper we use data sets that contain two labels to train on: the risk score predicted by a black-box model, as well as the actual outcome the risk score was intended to predict. This allows us to compare models trained to predict each label. For a particular class of student models - interpretable generalized additive models with pairwise interactions (GA2Ms) - we provide confidence intervals for the difference between the risk score and actual outcome models. This presents a new method for detecting bias in black-box risk scores: assessing whether the contributions of protected features to the risk score are statistically different from their contributions to the actual outcome.
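The comparison the abstract describes can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: scikit-learn's `GradientBoostingRegressor` stands in for the GA2M student models, a bootstrap percentile interval stands in for the paper's confidence intervals, and the array names (`X`, `risk_score`, `outcome`), the protected column index, and its two group values are all assumptions.

```python
# Sketch of the distill-and-compare idea: train one student on the
# black-box risk score (the mimic) and one on the true outcome, then
# bootstrap the difference in a protected feature's contribution.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor


def feature_contribution(model, X, col, lo, hi):
    """Marginal effect of moving column `col` from value `lo` to `hi`,
    averaged over the data (a partial-dependence-style contribution)."""
    X_lo, X_hi = X.copy(), X.copy()
    X_lo[:, col], X_hi[:, col] = lo, hi
    return model.predict(X_hi).mean() - model.predict(X_lo).mean()


def audit(X, risk_score, outcome, col, lo=0.0, hi=1.0, n_boot=100, seed=0):
    """Bootstrap CI for the gap between the protected feature's
    contribution to the risk score vs. to the actual outcome."""
    rng = np.random.default_rng(seed)
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(X), len(X))  # bootstrap resample
        mimic = GradientBoostingRegressor().fit(X[idx], risk_score[idx])
        actual = GradientBoostingRegressor().fit(X[idx], outcome[idx])
        diffs.append(
            feature_contribution(mimic, X[idx], col, lo, hi)
            - feature_contribution(actual, X[idx], col, lo, hi)
        )
    return np.percentile(diffs, [2.5, 97.5])  # 95% interval for the gap
```

If the resulting interval excludes zero, the protected feature contributes differently to the black-box score than to the real outcome, which is the bias signal the paper proposes to detect.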
Year | Venue | Field
---|---|---|
2017 | arXiv: Machine Learning | Framingham Risk Score, Black box, Pairwise comparison, Data mining, Data set, Additive model, Distillation, Artificial intelligence, Confidence interval, Mathematics, Machine learning
DocType | Volume | Citations
---|---|---|
Journal | abs/1710.06169 | 4
PageRank | References | Authors
---|---|---|
0.43 | 21 | 4
Name | Order | Citations | PageRank |
---|---|---|---|
Sarah Tan | 1 | 9 | 3.68 |
Rich Caruana | 2 | 4503 | 655.71 |
Giles Hooker | 3 | 59 | 13.40 |
Yin Lou | 4 | 506 | 28.82 |