Title
A Zest of LIME: Towards Architecture-Independent Model Distances
Abstract
Definitions of the distance between two machine learning models either characterize the similarity of the models' predictions or of their weights. While similarity of weights is attractive because it implies similarity of predictions in the limit, it suffers from being inapplicable to comparing models with different architectures. On the other hand, the similarity of predictions is broadly applicable but depends heavily on the choice of model inputs during comparison. In this paper, we instead propose to compute distance between black-box models by comparing their Local Interpretable Model-Agnostic Explanations (LIME). To compare two models, we take a reference dataset, and locally approximate the models on each reference point with linear models trained by LIME. We then compute the cosine distance between the concatenated weights of the linear models. This yields an approach that is both architecture-independent and possesses the benefits of comparing models in weight space. We empirically show that our method, which we call Zest, can be applied to two problems that require measurements of model similarity: detecting model stealing and machine unlearning.
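The procedure described in the abstract can be summarized as: fit a LIME linear surrogate to each model at every reference point, concatenate each model's surrogate weights, and take the cosine distance between the two concatenated vectors. The following is a minimal sketch of that pipeline, assuming tabular data, binary classifiers exposing a scikit-learn-style predict_proba, and the open-source lime package; the helper names lime_weights and zest_distance, and all hyperparameter choices, are illustrative and not taken from the paper.

import numpy as np
from lime.lime_tabular import LimeTabularExplainer
from scipy.spatial.distance import cosine

def lime_weights(predict_proba, explainer, reference_points, num_features):
    # Fit a LIME linear surrogate at each reference point and collect its weights.
    weights = []
    for x in reference_points:
        exp = explainer.explain_instance(x, predict_proba, num_features=num_features)
        # exp.local_exp maps a class label to (feature_index, weight) pairs;
        # label 1 is LIME's default explained label for a binary classifier.
        w = np.zeros(num_features)
        for idx, value in exp.local_exp[1]:
            w[idx] = value
        weights.append(w)
    return np.concatenate(weights)

def zest_distance(model_a, model_b, reference_points, training_data):
    # Cosine distance between the concatenated per-point LIME weights of two models.
    num_features = training_data.shape[1]
    explainer = LimeTabularExplainer(training_data, mode="classification")
    w_a = lime_weights(model_a.predict_proba, explainer, reference_points, num_features)
    w_b = lime_weights(model_b.predict_proba, explainer, reference_points, num_features)
    return cosine(w_a, w_b)  # 0 when the surrogates agree, up to 2 when they oppose

In a use case like the stealing detection mentioned in the abstract, a small zest_distance between a suspect model and a victim model would indicate that their local explanations align on the reference points.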
Year
2022
Venue
International Conference on Learning Representations (ICLR)
Keywords
model distance, model stealing, machine unlearning, fairwashing
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
5
Name                    Order   Citations   PageRank
Hengrui Jia             1       0           0.34
Hongyu Chen             2       0           0.34
Jonas Guan              3       0           0.34
Ali Shahin Shamsabadi   4       0           0.34
Nicolas Papernot        5       1932        87.62