Title
A Zest of LIME: Towards Architecture-Independent Model Distances
Abstract
Definitions of the distance between two machine learning models either characterize the similarity of the models' predictions or of their weights. While similarity of weights is attractive because it implies similarity of predictions in the limit, it suffers from being inapplicable to comparing models with different architectures. On the other hand, the similarity of predictions is broadly applicable but depends heavily on the choice of model inputs during comparison. In this paper, we instead propose to compute distance between black-box models by comparing their Local Interpretable Model-Agnostic Explanations (LIME). To compare two models, we take a reference dataset, and locally approximate the models on each reference point with linear models trained by LIME. We then compute the cosine distance between the concatenated weights of the linear models. This yields an approach that is both architecture-independent and possesses the benefits of comparing models in weight space. We empirically show that our method, which we call Zest, can be applied to two problems that require measurements of model similarity: detecting model stealing and machine unlearning.
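The procedure described in the abstract can be summarized as: fit a LIME linear surrogate to each model at every reference point, concatenate each model's surrogate weights, and take the cosine distance between the two concatenated vectors. The following is a minimal sketch of that pipeline, assuming tabular data, binary classifiers exposing a scikit-learn-style predict_proba, and the open-source lime package; the helper names lime_weights and zest_distance, and all hyperparameter choices, are illustrative and not taken from the paper.

import numpy as np
from lime.lime_tabular import LimeTabularExplainer
from scipy.spatial.distance import cosine

def lime_weights(predict_proba, explainer, reference_points, num_features):
    # Fit a LIME linear surrogate at each reference point and collect its weights.
    weights = []
    for x in reference_points:
        exp = explainer.explain_instance(x, predict_proba, num_features=num_features)
        # exp.local_exp maps a class label to (feature_index, weight) pairs;
        # label 1 is LIME's default explained label for a binary classifier.
        w = np.zeros(num_features)
        for idx, value in exp.local_exp[1]:
            w[idx] = value
        weights.append(w)
    return np.concatenate(weights)

def zest_distance(model_a, model_b, reference_points, training_data):
    # Cosine distance between the concatenated per-point LIME weights of two models.
    num_features = training_data.shape[1]
    explainer = LimeTabularExplainer(training_data, mode="classification")
    w_a = lime_weights(model_a.predict_proba, explainer, reference_points, num_features)
    w_b = lime_weights(model_b.predict_proba, explainer, reference_points, num_features)
    return cosine(w_a, w_b)  # 0 when the surrogates agree, up to 2 when they oppose

In a use case like the stealing detection mentioned in the abstract, a small zest_distance between a suspect model and a victim model would indicate that their local explanations align on the reference points.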
Year
2022
Venue
International Conference on Learning Representations (ICLR)
Keywords
model distance, model stealing, machine unlearning, fairwashing
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
5
Name                    Order   Citations   PageRank
Hengrui Jia             1       0           0.34
Hongyu Chen             2       0           0.34
Jonas Guan              3       0           0.34
Ali Shahin Shamsabadi   4       0           0.34
Nicolas Papernot        5       1932        87.62