Title
Weighted Outlier Detection of High-Dimensional Categorical Data Using Feature Grouping
Abstract
We propose WATCH, a weighted outlier mining method for identifying outliers in high-dimensional categorical datasets. WATCH consists of two modules: 1) feature grouping, which measures the correlation among features, and 2) outlier mining, which assigns a score to each object within each feature group. The feature grouping module is the core of WATCH: it partitions the features into multiple groups so that each group exposes a different aspect of the feature patterns. The outlier mining module then uses these per-group scores to detect outliers. Apart from the number of outliers, which the user specifies, WATCH requires no tuning of user-supplied parameters. We implement WATCH and evaluate it on synthetic and real-world datasets. Our experimental results show that WATCH is a promising and practical algorithm for detecting outliers in high-dimensional categorical datasets, achieving high performance in terms of precision, efficiency, and interpretability.
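The two-module structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' WATCH algorithm: the correlation measure (normalized mutual information), the grouping rule (connected components above a fixed `threshold`), the rarity-based score, and the group-size weighting are all assumptions chosen for illustration, and every function name here is hypothetical. In particular, the `threshold` parameter is an artifact of this sketch; the abstract states that WATCH itself needs only the number of outliers.

```python
from collections import Counter
from itertools import combinations
from math import log


def entropy(column):
    """Shannon entropy (natural log) of a sequence of categorical values."""
    n = len(column)
    return -sum((c / n) * log(c / n) for c in Counter(column).values())


def normalized_mi(x, y):
    """Mutual information of two columns, scaled into [0, 1] (assumed measure)."""
    hx, hy = entropy(x), entropy(y)
    mi = hx + hy - entropy(list(zip(x, y)))
    denom = max(hx, hy)
    return mi / denom if denom > 0 else 0.0


def group_features(rows, threshold):
    """Module 1 (sketch): merge features whose pairwise correlation >= threshold."""
    m = len(rows[0])
    cols = [[r[j] for r in rows] for j in range(m)]
    parent = list(range(m))  # union-find forest over feature indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in combinations(range(m), 2):
        if normalized_mi(cols[a], cols[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for j in range(m):
        groups.setdefault(find(j), []).append(j)
    return sorted(groups.values())


def outlier_scores(rows, groups):
    """Module 2 (sketch): sum, over groups, the weighted rarity of each object."""
    n, m = len(rows), len(rows[0])
    scores = [0.0] * n
    for g in groups:
        weight = len(g) / m  # assumed weighting: share of features in the group
        counts = Counter(tuple(r[j] for j in g) for r in rows)
        for i, r in enumerate(rows):
            scores[i] += weight * (1 - counts[tuple(r[j] for j in g)] / n)
    return scores


def top_k_outliers(rows, k, threshold=0.5):
    """Return the indices of the k highest-scoring objects."""
    scores = outlier_scores(rows, group_features(rows, threshold))
    return sorted(range(len(rows)), key=lambda i: scores[i], reverse=True)[:k]
```

Calling `top_k_outliers(rows, k)` on a list of equal-length tuples of categorical values returns the row indices ranked most anomalous under these assumptions. Scoring value combinations within a group, rather than single features, is what lets a row with individually common values but an unusual combination stand out.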
Year: 2020
DOI: 10.1109/TSMC.2018.2847625
Venue: IEEE Transactions on Systems, Man, and Cybernetics: Systems
Keywords: Categorical data, feature grouping, feature relation, feature weighting, outlier detection
DocType: Journal
Volume: 50
Issue: 11
ISSN: 2168-2216
Citations: 2
PageRank: 0.35
References: 14
Authors: 4
Name         Order  Citations  PageRank
Junli Li     1      2          1.70
Jifu Zhang   2      95         19.42
Ning Pang    3      2          1.37
Xiao Qin     4      1836       125.69