Understanding and Mitigating Data Contamination in Deep Anomaly Detection: A Kernel-based Approach. - Citegraph

Paper Info

Title
Understanding and Mitigating Data Contamination in Deep Anomaly Detection: A Kernel-based Approach.

Abstract
Deep anomaly detection has become popular for its capability of handling complex data. However, training a deep detector is fragile to data contamination due to overfitting. In this work, we study the performance of the anomaly detectors under data contamination and construct a data-efficient countermeasure against data contamination. We show that training a deep anomaly detector induces an implicit kernel machine. We then derive an information-theoretic bound of performance degradation with respect to the data contamination ratio. To mitigate the degradation, we propose a contradicting training approach. Apart from learning normality on the contaminated dataset, our approach discourages learning an additional small auxiliary dataset of labeled anomalies. Our approach is much more affordable than constructing a completely clean training dataset. Experiments on public datasets show that our approach significantly improves anomaly detection in the presence of contamination and outperforms some recently proposed detectors.
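The abstract describes the contradicting training idea only in words, and this record does not give the paper's actual objective. The following is a minimal sketch of one way such an objective could look, assuming a distance-to-center anomaly score. The names `sq_dist`, `contradicting_loss`, `center`, `eta`, and `eps` are hypothetical and chosen for illustration, and the inverse-distance penalty is a stand-in, not the authors' formulation.

```python
def sq_dist(z, center):
    """Squared Euclidean distance between an embedding and the center."""
    return sum((a - b) ** 2 for a, b in zip(z, center))


def contradicting_loss(train_emb, anomaly_emb, center, eta=1.0, eps=1e-6):
    """Hypothetical objective, not taken from the paper.

    train_emb:   embeddings of the (possibly contaminated) training set
    anomaly_emb: embeddings of the small auxiliary set of labeled anomalies
    eta:         assumed weight of the anomaly term
    """
    # Learn normality: pull training embeddings toward the center.
    normality = sum(sq_dist(z, center) for z in train_emb) / len(train_emb)
    # Discourage fitting labeled anomalies: penalize them for being close.
    penalty = sum(1.0 / (sq_dist(z, center) + eps) for z in anomaly_emb)
    penalty /= len(anomaly_emb)
    return normality + eta * penalty


center = [0.0, 0.0]
train = [[0.1, 0.0], [0.0, 0.2]]
far_anomalies = [[3.0, 4.0]]   # anomalies mapped far from the center
near_anomalies = [[0.1, 0.1]]  # anomalies mapped close to the center

# The loss is lower when labeled anomalies sit far from the center.
print(contradicting_loss(train, far_anomalies, center))
print(contradicting_loss(train, near_anomalies, center))
```

In this sketch the first term fits the contaminated data while the second works against fitting the labeled anomalies, which is the sense in which the two parts of the training signal contradict each other.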

Year	DOI	Venue
2022	10.24963/ijcai.2022/322	European Conference on Artificial Intelligence
Keywords	DocType	Citations
Data Mining: Anomaly/Outlier Detection,Machine Learning: Kernel Methods	Conference	0
PageRank	References	Authors
0.34	0	3

Authors (3 rows)

Name	Order	Citations	PageRank
Shuang Wu	1	0	1.01
Jingyu Zhao	2	0	0.34
Guangjian Tian	3	14	4.56
