Title
Two New Term Weighting Methods For Router Syslogs Anomaly Detection
Abstract
A router's syslogs are a sequence of events observed and logged by the router, and they have been widely used in the system security field. This paper focuses on detecting anomalous router behavior by analyzing router syslogs. For syslog data pre-processing, hierarchical clustering based on counting cousin distance between event patterns is used to cluster events. To construct a time series, a time-window length is set, and every time window receives a score related to the event clusters in it. Instead of treating every event cluster in a time window equally, we assign weights to the event clusters using four term weighting methods. Two of them, inverse document frequency (IDF) and residual inverse document frequency (RIDF), are widely used in the information retrieval field. Because of their drawbacks, this paper proposes two new weighting methods. The first, IDFVAR, is a modification of the RIDF method that takes data distribution, frequency, and several additional factors into consideration. The second, IDFJMP, introduces a JMP value to evaluate the degree of sudden change of an event cluster. To compare these term weighting methods, experiments are run on an estimated data set. We then detect anomalous behaviors by applying a method derived from standard deviation to the chosen time series. Finally, we conduct experiments on real router syslogs.
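The two baseline weightings named in the abstract, and the window scoring and standard-deviation check built on them, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes each time window is a list of event-cluster labels and plays the role of a document, uses the textbook IDF and Poisson-based RIDF definitions, scores a window as the sum of the weights of its events, and flags windows with a hypothetical threshold of `k` standard deviations from the mean. IDFVAR and IDFJMP are not defined in the abstract, so they are not attempted here. All function names are illustrative.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def idf_weights(windows):
    """Textbook IDF per event cluster: log2(N / df), with windows as documents."""
    n = len(windows)
    df = Counter(c for w in windows for c in set(w))  # windows containing c
    return {c: math.log2(n / d) for c, d in df.items()}


def ridf_weights(windows):
    """Textbook residual IDF: observed IDF minus the IDF a Poisson model predicts."""
    n = len(windows)
    df = Counter(c for w in windows for c in set(w))  # windows containing c
    cf = Counter(c for w in windows for c in w)       # total occurrences of c
    return {
        c: math.log2(n / df[c]) + math.log2(1.0 - math.exp(-cf[c] / n))
        for c in df
    }


def window_scores(windows, weights):
    """Assumed scoring rule: sum the weight of every event in the window."""
    return [sum(weights[c] for c in w) for w in windows]


def flag_anomalies(scores, k=2.0):
    """Assumed detection rule: indices whose score is more than k sigma from the mean."""
    mu, sigma = mean(scores), pstdev(scores)
    return [i for i, s in enumerate(scores) if abs(s - mu) > k * sigma]


# Toy input: four time windows of event-cluster labels (made up for illustration).
windows = [["A", "A", "B"], ["A"], ["A", "C", "C", "C", "C"], ["A", "B"]]
scores = window_scores(windows, idf_weights(windows))
suspicious = flag_anomalies(scores, k=1.5)
```

In this sketch a cluster that appears in every window gets an IDF of zero, so routine events contribute nothing to a window's score, while bursts of rare clusters dominate it.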
Year
2016
DOI
10.1109/HPCC-SmartCity-DSS.2016.206
Venue
PROCEEDINGS OF 2016 IEEE 18TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS; IEEE 14TH INTERNATIONAL CONFERENCE ON SMART CITY; IEEE 2ND INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS)
Keywords
router syslog, log analysis, anomaly detection, term weighting method, network security, time series
Field
Hierarchical clustering, Data mining, Residual, Anomaly detection, Time series, Weighting, tf–idf, Computer science, Real-time computing, Router, Distributed computing, syslog
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
5
Name            Order  Citations  PageRank
Tunzi Tan       1      0          1.01
Suixiang Gao    2      44         12.48
Wenguo Yang     3      6          4.20
Yuezhong Song   4      0          0.68
Chengyong Lin   5      0          0.34