Title
A MapReduce Reinforced Distributed Sequential Pattern Mining Algorithm.
Abstract
Redesigning and reimplementing traditional sequential pattern mining algorithms on distributed computing frameworks is essential for dealing with big data. The critical issue is how to minimize the communication overhead of a distributed sequential pattern mining algorithm and maximize its execution efficiency by balancing the workload across distributed computing resources. To address this issue, this paper proposes DGSP (Distributed GSP algorithm based on MapReduce), a MapReduce reinforced distributed sequential pattern mining algorithm that consists of two MapReduce jobs. The "two-jobs" structure of DGSP can effectively reduce the communication overhead of distributed sequential pattern mining. DGSP also improves workload balance and execution efficiency by evenly partitioning the database and assigning the fragments to Map workers. Experimental results indicate that DGSP can significantly improve the overall performance, scalability, and fault tolerance of sequential pattern mining on big data.
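To make the even-partitioning idea in the abstract concrete, here is a minimal sketch. It splits a toy sequence database into near-equal fragments, counts item support per fragment (a map-like step), and sums the partial counts (a reduce-like step). This is not the DGSP implementation: the abstract does not specify what each of the two jobs computes, so the function names `partition_evenly`, `map_count`, `reduce_counts`, and `frequent_items`, the toy data, and the restriction to single-item candidates are all assumptions made for illustration. Plain Python stands in for a MapReduce framework.

```python
from collections import Counter
from typing import Dict, List, Sequence

# Simplifying assumption: a sequence is an ordered list of single items.
SequenceDB = List[List[str]]


def partition_evenly(db: SequenceDB, n_workers: int) -> List[SequenceDB]:
    """Split db into n_workers fragments whose sizes differ by at most one."""
    return [db[i::n_workers] for i in range(n_workers)]


def map_count(fragment: SequenceDB) -> Counter:
    """Map-like step: local support of each item, counted once per sequence."""
    local: Counter = Counter()
    for seq in fragment:
        local.update(set(seq))  # set() so repeats within a sequence count once
    return local


def reduce_counts(partials: Sequence[Counter], min_support: int) -> Dict[str, int]:
    """Reduce-like step: sum local counts and keep items meeting min_support."""
    total: Counter = Counter()
    for partial in partials:
        total.update(partial)
    return {item: n for item, n in total.items() if n >= min_support}


def frequent_items(db: SequenceDB, n_workers: int, min_support: int) -> Dict[str, int]:
    """Hypothetical driver: partition, count per fragment, then merge."""
    fragments = partition_evenly(db, n_workers)
    return reduce_counts([map_count(f) for f in fragments], min_support)


if __name__ == "__main__":
    toy_db = [["a", "b", "c"], ["a", "c"], ["b", "a"], ["d"], ["a", "d"]]
    print([len(f) for f in partition_evenly(toy_db, 2)])
    print(frequent_items(toy_db, n_workers=2, min_support=3))
```

Because each fragment is counted independently, only the small per-fragment counters need to be merged, which is the general reason even partitioning helps balance work across Map workers.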
Year: 2015
DOI: 10.1007/978-3-319-27122-4_13
Venue: ICA3PP
Field: Computer science, GSP Algorithm, Workload, Parallel computing, Distributed design patterns, Algorithm, Distributed algorithm, Fault tolerance, Sequential Pattern Mining, Big data, Scalability, Distributed computing
DocType: Conference
Citations: 3
PageRank: 0.36
References: 11
Authors: 5
Name
Order
Citations
PageRank
Xiao Yu1186.14
Jin Liu2425.98
Xiao Liu311115.20
Chuanxiang Ma441.04
Bin Li56827.40