Abstract |
---|
Event correlation is a cornerstone of process discovery over event logs that span multiple data sources. The computed correlation rules and process instances help unleash the power of process mining. However, exploring all possible event correlations over a log can be time-consuming, especially when the log is large. State-of-the-art MapReduce-based methods designed to handle this challenge offer significant performance improvements over standalone implementations. However, all existing techniques still rely on a conventional generating-and-pruning scheme, so event partitioning across multiple machines is often inefficient. In this paper, following the principle of filtering-and-verification, we propose a new algorithm, called RF-GraP, which performs event correlation more efficiently on distributed systems. We present the detailed implementation of our approach and conduct a quantitative evaluation on the Spark platform. Experimental results demonstrate that the proposed method is efficient: compared to the state of the art, it achieves significant performance speedups with noticeably less network communication. |
Year | DOI | Venue |
---|---|---|
2017 | 10.1109/CCGRID.2017.94 | CCGrid |
Keywords | Field | DocType |
---|---|---|
event correlation, process mining, service computing, data partitioning, big data, data-intensive computing | Spark (mathematics), Algorithm design, Data-intensive computing, Computer science, Event correlation, Event partitioning, Business process discovery, Big data, Process mining, Distributed computing | Conference |
ISSN | ISBN | Citations |
---|---|---|
2376-4414 | 978-1-5090-5980-5 | 2 |
PageRank | References | Authors |
---|---|---|
0.37 | 21 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Long Cheng | 1 | 91 | 16.99 |
Boudewijn F. van Dongen | 2 | 1875 | 97.84 |
Wil van der Aalst | 3 | 20894 | 1418.27 |