Title
Efficient high utility itemset mining using buffered utility-lists
Abstract
Discovering high utility itemsets in transaction databases is a key task for studying the behavior of customers. It consists of finding groups of items bought together that yield a high profit. Several algorithms have been proposed to mine high utility itemsets using various approaches and more or less complex data structures. Among existing algorithms, one-phase algorithms employing the utility-list structure have shown to be the most efficient. In recent years, the simplicity of the utility-list structure has led to the development of numerous utility-list based algorithms for various tasks related to utility mining. However, a major limitation of utility-list based algorithms is that creating and maintaining utility-lists are time consuming and can consume a huge amount of memory. The reasons are that numerous utility lists are built and that the utility-list intersection/join operation to construct a utility-list is costly. This paper addresses this issue by proposing an improved utility-list structure called utility-list buffer to reduce the memory consumption and speed up the join operation. This structure is integrated into a novel algorithm named ULB-Miner (Utility-List Buffer for high utility itemset Miner), which introduces several new ideas to more efficiently discover high utility itemsets. ULB-Miner uses the designed utility-list buffer structure to efficiently store and retrieve utility-lists, and reuse memory during the mining process. Moreover, the paper also introduces a linear time method for constructing utility-list segments in a utility-list buffer. An extensive experimental study on various datasets shows that the proposed algorithm relying on the novel utility-list buffer structure is highly efficient in terms of both execution time and memory consumption. The ULB-Miner algorithm is up to 10 times faster than the FHM and HUI-Miner algorithms and consumes up to 6 times less memory. Moreover, it performs well on both dense and sparse datasets. © 2017 Springer Science+Business Media, LLC
Year
DOI
Venue
2018
10.1007/s10489-017-1057-2
APPLIED INTELLIGENCE
Keywords
Field
DocType
Pattern mining,Itemset mining,Utility mining,Utility list,Utility list buffer
Data mining,Utility mining,Computer science,Artificial intelligence,Database transaction,Machine learning
Journal
Volume
Issue
ISSN
48
7
0924-669X
Citations 
PageRank 
References 
8
0.46
32
Authors
5
Name
Order
Citations
PageRank
Q.-H. Duong1767.00
Philippe Fournier-Viger21587110.19
Heri Ramampiaro315420.46
Kjetil Nørvåg4131179.26
Dam Thu-Lan5745.24