Title
LogSig: generating system events from raw textual logs
Abstract
Modern computing systems generate large amounts of log data. System administrators and domain experts use this log data to understand and optimize system behavior. Most system logs are raw, unstructured text. A fundamental challenge in automated log analysis is generating system events from raw textual logs. Log messages are relatively short but may draw on a large vocabulary, which often results in poor performance when traditional text clustering techniques are applied to log data. Other related methods have various limitations and work well only for particular system logs. In this paper, we propose logSig, a message-signature-based algorithm that generates system events from textual log messages. By searching for the most representative message signatures, logSig categorizes log messages into a set of event types. logSig can handle various types of log data and can incorporate human domain knowledge to achieve high performance. We conduct experiments on five real system log data sets. The experiments show that logSig outperforms the alternative algorithms in overall performance.
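The abstract does not spell out how messages are matched to signatures, so the following is only a rough illustration of signature-based categorization, not the logSig procedure itself. It assumes the signatures are already known and written by hand (logSig searches for them), and it scores a match by the longest common subsequence of tokens, which is a choice made for this sketch. The function names, log lines, and signatures are all hypothetical.

```python
from collections import defaultdict


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def assign_events(messages, signatures):
    """Group each message under the signature it shares the most ordered tokens with."""
    events = defaultdict(list)
    for msg in messages:
        tokens = msg.split()
        best = max(signatures, key=lambda s: lcs_length(tokens, s.split()))
        events[best].append(msg)
    return dict(events)


# Hypothetical log lines and hand-written signatures, invented for illustration.
messages = [
    "User alice logged in from 10.0.0.1",
    "User bob logged in from 10.0.0.2",
    "Disk sda1 usage at 91 percent",
]
signatures = ["User logged in from", "Disk usage at percent"]
events = assign_events(messages, signatures)
```

Each key of `events` plays the role of an event type, and the variable parts of a message (user name, address, device) are simply the tokens that fall outside its signature.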
Year
2011
DOI
10.1145/2063576.2063690
Venue
CIKM
Keywords
log message, textual log message, raw textual log, log data, real system log data, system event, modern computing system, automated log analysis, system log, particular system log, domain knowledge, text clustering
Field
Data mining, Domain knowledge, Information retrieval, Computer science, Document clustering, Vocabulary, Computing systems
DocType
Conference
Citations
28
PageRank
1.03
References
22
Authors
3
Name               Order  Citations  PageRank
Liang Tang         1      256        14.44
Tao Li             2      7216       393.45
Chang-Shing Perng  3      478        35.92