Title
An efficient structure for fast mining high utility itemsets.
Abstract
High utility itemset mining has emerged as an important research issue in data mining because it has a wide range of real-life applications. Although a number of algorithms have been proposed in recent years, mining efficiency remains a major challenge: these algorithms either calculate candidates' utilities inefficiently or generate a huge number of candidates. In this paper, we propose a novel data structure named PUN-list (PU-tree-Node list), which maintains both the utility information about an itemset and a utility upper bound, to facilitate the mining of high utility itemsets. Based on PUN-lists, we present a method named MIP (Mining high utility Itemsets using PUN-lists) for efficiently mining high utility itemsets. The efficiency of MIP is achieved with three techniques. First, itemsets are represented by a highly condensed data structure, the PUN-list, which avoids costly and repeated utility computation. Second, the utility of an itemset can be efficiently calculated by scanning the PUN-list of the itemset, and the PUN-lists of long itemsets can be efficiently constructed from the PUN-lists of short itemsets. Third, by employing the utility upper bound stored in the PUN-lists as the pruning strategy, MIP directly discovers high utility itemsets from the search space, a set-enumeration tree, without generating numerous candidates. Extensive experiments on various synthetic and real datasets show that MIP is very efficient: it is much faster than HUI-Miner, d2HUP, and UP-Growth+, especially on dense datasets.
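To make the problem setting concrete, the following is a minimal sketch of high utility itemset mining over a set-enumeration tree with upper-bound pruning. It is not the PUN-list structure or the MIP algorithm, whose details the abstract does not give. It assumes the standard definition of itemset utility and swaps in the widely used transaction-weighted utility (TWU) as the upper bound. The toy database and the function names `utility`, `twu`, and `mine` are hypothetical.

```python
# Hypothetical toy database: each transaction maps an item to its utility
# in that transaction (for example, quantity times unit profit).
TRANSACTIONS = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 8},
    {"a": 5, "d": 2},
]


def utility(itemset, transactions):
    """Utility of an itemset: its items' utilities summed over every
    transaction that contains all of its items."""
    items = set(itemset)
    return sum(
        sum(t[i] for i in items)
        for t in transactions
        if items.issubset(t)
    )


def twu(itemset, transactions):
    """Transaction-weighted utility: total utility of every transaction
    containing the itemset. It upper-bounds the utility of the itemset
    and of all its supersets (assumed stand-in for the paper's bound)."""
    items = set(itemset)
    return sum(sum(t.values()) for t in transactions if items.issubset(t))


def mine(transactions, min_util):
    """Depth-first walk of the set-enumeration tree, pruning any branch
    whose upper bound falls below the minimum utility threshold."""
    all_items = sorted({i for t in transactions for i in t})
    result = {}

    def extend(prefix, candidates):
        for idx, item in enumerate(candidates):
            itemset = prefix + (item,)
            if twu(itemset, transactions) < min_util:
                continue  # no superset of this itemset can qualify
            u = utility(itemset, transactions)
            if u >= min_util:
                result[itemset] = u
            # Only extend with later items so each itemset is visited once.
            extend(itemset, candidates[idx + 1:])

    extend((), all_items)
    return result


if __name__ == "__main__":
    print(mine(TRANSACTIONS, min_util=20))
```

In this toy database the itemset {a, c} has a higher utility than {c} alone, which shows why utility is not anti-monotone and why a separate upper bound is needed for pruning. This sketch rescans the whole database for every itemset, which is the kind of repeated utility computation a condensed list structure is meant to avoid.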
Year: 2018
DOI: 10.1007/s10489-017-1130-x
Venue: Appl. Intell.
Keywords: Data structure, Data mining, High utility itemset, PUN-list, Utility mining
Field: Data mining, Data structure, Computer science, Upper and lower bounds, Database, Computation
DocType: Journal
Volume: 48
Issue: 9
ISSN: 0924-669X
Citations: 3
PageRank: 0.37
References: 16
Authors: 1
Name: Zhi-Hong Deng
Order: 1
Citations: 185
PageRank: 23.33