Title
Prefix-Graph: A Versatile Log Parsing Approach Merging Prefix Tree With Probabilistic Graph
Abstract
Logs play an important part in analyzing system behavior and diagnosing system failures. As the basic step of log analysis, log parsing converts raw log messages into structured log templates. However, existing log parsing approaches are not adaptive and versatile enough to ensure high accuracy on all types of datasets. In particular, they require manually designing regular expressions or fine-tuning hyper-parameters to achieve the best performance. In this paper, we propose Prefix-Graph, an online versatile log parsing approach. Prefix-Graph is a probabilistic graph structure extended from a prefix tree. It iteratively merges pairs of branches whose probability distributions are highly similar, and represents log templates as the combination of cut-edges in root-to-leaf paths of the graph. Since no domain knowledge is used and all the parameters are fixed, Prefix-Graph can be easily applied to different log datasets without any additional manual work. We evaluate our approach on 10 real-world datasets and 117 GB of log messages obtained from Huawei. The experimental results demonstrate that Prefix-Graph achieves the highest average accuracy of 0.975 and the smallest standard deviation of 0.037. Our approach is superior to baseline methods in terms of adaptability and versatility.
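To make the idea in the abstract concrete, the following Python sketch builds a prefix tree over whitespace-split log tokens and merges sibling branches whose next-token distributions are similar. It is a simplified illustration under stated assumptions, not the authors' Prefix-Graph algorithm: the names Node, insert, similarity, merge_tree, and parse, the 0.9 threshold, the similarity measure (1 minus total variation distance), and the <*> wildcard are all hypothetical choices, and the cut-edge template extraction described in the paper is not reproduced here.

```python
WILDCARD = "<*>"  # hypothetical placeholder for a variable token


class Node:
    """One prefix-tree node; count is how many log lines passed through it."""
    def __init__(self):
        self.children = {}  # token -> Node
        self.count = 0


def insert(root, tokens):
    """Add one tokenized log line to the prefix tree, updating pass counts."""
    node = root
    node.count += 1
    for tok in tokens:
        node = node.children.setdefault(tok, Node())
        node.count += 1


def next_token_dist(node):
    """Empirical distribution over the tokens that follow this node."""
    total = sum(c.count for c in node.children.values())
    if total == 0:
        return {}
    return {t: c.count / total for t, c in node.children.items()}


def similarity(a, b):
    """Assumed measure: 1 - total variation distance of next-token distributions."""
    pa, pb = next_token_dist(a), next_token_dist(b)
    keys = set(pa) | set(pb)
    return 1.0 - 0.5 * sum(abs(pa.get(k, 0.0) - pb.get(k, 0.0)) for k in keys)


def merge_into(dst, src):
    """Fold the subtree src into dst, summing counts on shared tokens."""
    dst.count += src.count
    for tok, child in src.children.items():
        if tok in dst.children:
            merge_into(dst.children[tok], child)
        else:
            dst.children[tok] = child


def merge_similar_children(node, threshold):
    """Repeatedly merge the first pair of sufficiently similar sibling branches."""
    merged_any = True
    while merged_any:
        merged_any = False
        toks = list(node.children)
        for i, a in enumerate(toks):
            for b in toks[i + 1:]:
                if similarity(node.children[a], node.children[b]) >= threshold:
                    na, nb = node.children.pop(a), node.children.pop(b)
                    merge_into(na, nb)
                    if WILDCARD in node.children:
                        merge_into(na, node.children.pop(WILDCARD))
                    node.children[WILDCARD] = na
                    merged_any = True
                    break
            if merged_any:
                break


def merge_tree(node, threshold=0.9):
    """Apply sibling merging top-down over the whole tree."""
    merge_similar_children(node, threshold)
    for child in list(node.children.values()):
        merge_tree(child, threshold)


def templates(node, prefix=()):
    """Yield one template per root-to-leaf path."""
    if not node.children:
        yield " ".join(prefix)
    for tok, child in node.children.items():
        yield from templates(child, prefix + (tok,))


def parse(lines, threshold=0.9):
    """Build the tree from raw lines, merge similar branches, return templates."""
    root = Node()
    for line in lines:
        insert(root, line.split())
    merge_tree(root, threshold)
    return sorted(templates(root))
```

Calling parse on a handful of lines that differ only in one token position (for example an IP address) collapses that position into the <*> wildcard. Note that this sketch treats any two leaf siblings as identical, so it will also merge constant final tokens; a real parser needs a more careful criterion.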
Year
2021
DOI
10.1109/ICDE51399.2021.00274
Venue
2021 IEEE 37th International Conference on Data Engineering (ICDE 2021)
Keywords
log parsing, template extraction, prefix tree, probabilistic graph
DocType
Conference
ISSN
1084-4627
Citations
0
PageRank
0.34
References
0
Authors
6
Name | Order | Citations | PageRank
Guojun Chu | 1 | 0 | 0.34
J. Wang | 2 | 479 | 95.23
Qi Qi | 3 | 210 | 56.01
Haifeng Sun | 4 | 68 | 27.77
Shimin Tao | 5 | 0 | 0.34
Jianxin Liao | 6 | 457 | 82.08