Abstract | ||
---|---|---|
Discovering high utility itemsets in transaction databases is a key task for studying the behavior of customers. It consists of finding groups of items bought together that yield a high profit. Several algorithms have been proposed to mine high utility itemsets using various approaches and more or less complex data structures. Among existing algorithms, one-phase algorithms employing the utility-list structure have shown to be the most efficient. In recent years, the simplicity of the utility-list structure has led to the development of numerous utility-list based algorithms for various tasks related to utility mining. However, a major limitation of utility-list based algorithms is that creating and maintaining utility-lists are time consuming and can consume a huge amount of memory. The reasons are that numerous utility lists are built and that the utility-list intersection/join operation to construct a utility-list is costly. This paper addresses this issue by proposing an improved utility-list structure called utility-list buffer to reduce the memory consumption and speed up the join operation. This structure is integrated into a novel algorithm named ULB-Miner (Utility-List Buffer for high utility itemset Miner), which introduces several new ideas to more efficiently discover high utility itemsets. ULB-Miner uses the designed utility-list buffer structure to efficiently store and retrieve utility-lists, and reuse memory during the mining process. Moreover, the paper also introduces a linear time method for constructing utility-list segments in a utility-list buffer. An extensive experimental study on various datasets shows that the proposed algorithm relying on the novel utility-list buffer structure is highly efficient in terms of both execution time and memory consumption. The ULB-Miner algorithm is up to 10 times faster than the FHM and HUI-Miner algorithms and consumes up to 6 times less memory. Moreover, it performs well on both dense and sparse datasets. © 2017 Springer Science+Business Media, LLC |
Year | DOI | Venue |
---|---|---|
2018 | 10.1007/s10489-017-1057-2 | APPLIED INTELLIGENCE |
Keywords | Field | DocType |
Pattern mining,Itemset mining,Utility mining,Utility list,Utility list buffer | Data mining,Utility mining,Computer science,Artificial intelligence,Database transaction,Machine learning | Journal |
Volume | Issue | ISSN |
48 | 7 | 0924-669X |
Citations | PageRank | References |
8 | 0.46 | 32 |
Authors | ||
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Q.-H. Duong | 1 | 76 | 7.00 |
Philippe Fournier-Viger | 2 | 1587 | 110.19 |
Heri Ramampiaro | 3 | 154 | 20.46 |
Kjetil Nørvåg | 4 | 1311 | 79.26 |
Dam Thu-Lan | 5 | 74 | 5.24 |