Title
Explaining Classification Models Built on High-Dimensional Sparse Data
Abstract
Predictive modeling applications increasingly use data representing people's behavior, opinions, and interactions. Fine-grained behavior data often has different structure from traditional data, being very high-dimensional and sparse. Models built from these data are quite difficult to interpret, since they contain many thousands or even many millions of features. Listing features with large model coefficients is not sufficient, because the model coefficients do not incorporate information on feature presence, which is key when analysing sparse data. In this paper we introduce two alternatives for explaining predictive models by listing important features. We evaluate these alternatives in terms of explanation "bang for the buck," i.e., how many examples' inferences are explained for a given number of features listed. The bottom line: (i) The proposed alternatives have double the bang-for-the-buck as compared to just listing the high-coefficient features, and (ii) interestingly, although they come from different sources and motivations, the two new alternatives provide strikingly similar rankings of important features.
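The evaluation metric described in the abstract lends itself to a small illustration. Below is a minimal sketch, assuming numpy and scipy; the `coverage` function, the synthetic data, and the coefficient-times-frequency ranking are all hypothetical stand-ins (the abstract does not specify the paper's two proposed alternatives). It shows how "bang for the buck" can be measured as the fraction of examples that have at least one of the top-k listed features actually present, and why a coefficient-only list can score poorly on sparse data.

```python
import numpy as np
from scipy.sparse import random as sparse_random

def coverage(X, ranked_features, k):
    """Fraction of rows of sparse matrix X in which at least one of the
    top-k listed features is present (nonzero) -- the 'bang for the buck'
    notion from the abstract, in sketch form."""
    top = ranked_features[:k]
    active = X[:, top].getnnz(axis=1) > 0  # any listed feature present in the row?
    return active.mean()

rng = np.random.default_rng(0)
n, d = 1000, 5000
# Hypothetical high-dimensional sparse behavior data.
X = sparse_random(n, d, density=0.002, format="csr", random_state=0)
coef = rng.normal(size=d)  # stand-in for learned model coefficients

# Ranking 1: by coefficient magnitude alone (ignores feature presence).
by_coef = np.argsort(-np.abs(coef))

# Ranking 2 (illustrative stand-in, NOT the paper's method): weight each
# coefficient by how often the feature is actually present, so that the
# listed features explain the inferences of more examples.
freq = X.getnnz(axis=0) / n
by_coef_freq = np.argsort(-(np.abs(coef) * freq))

for k in (10, 50):
    print(k, coverage(X, by_coef, k), coverage(X, by_coef_freq, k))
```

On data this sparse, a high-coefficient feature is absent from almost every example, so the presence-weighted list tends to explain many more examples per feature listed, which is the intuition the abstract's comparison rests on.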
Year
2016
Venue
arXiv: Machine Learning
Field
Data mining, Computer science, Artificial intelligence, Sparse matrix, Machine learning
DocType
Journal
Volume
abs/1607.06280
Citations
1
PageRank
0.34
References
3
Authors
4
Name | Order | Citations | PageRank
Julie Moeyersoms | 1 | 23 | 2.61
Brian Dalessandro | 2 | 1711 | 3.02
Foster J. Provost | 3 | 16 | 1.56
David Martens | 4 | 66 | 9.52