Title
CLS-Miner: efficient and effective closed high-utility itemset mining
Abstract
High-utility itemset mining (HUIM) is a popular data mining task with applications in numerous domains. However, traditional HUIM algorithms often produce a very large set of high-utility itemsets (HUIs). As a result, analyzing HUIs can be very time-consuming for users. Moreover, a large set of HUIs also makes HUIM algorithms less efficient in terms of execution time and memory consumption. To address this problem, closed high-utility itemsets (CHUIs), a concise and lossless representation of all HUIs, were recently proposed. Although mining CHUIs is useful and desirable, it remains a computationally expensive task, because current algorithms often generate a huge number of candidate itemsets and are unable to prune the search space effectively. In this paper, we address these issues by proposing a novel algorithm called CLS-Miner. The proposed algorithm utilizes the utility-list structure to directly compute the utilities of itemsets without producing candidates. It also introduces three novel strategies to reduce the search space, namely chain-estimated utility co-occurrence pruning, lower branch pruning, and pruning by coverage. Moreover, an effective method for checking whether an itemset is a subset of another itemset is introduced to further reduce the time required for discovering CHUIs. To evaluate the performance of the proposed algorithm and its novel strategies, extensive experiments have been conducted on six benchmark datasets with various characteristics. Results show that the proposed strategies are highly efficient and effective, that the proposed CLS-Miner algorithm outperforms the current state-of-the-art CHUD and CHUI-Miner algorithms, and that CLS-Miner scales linearly. © 2018 Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature
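To make the terms in the abstract concrete, the following is a minimal sketch of what a closed high-utility itemset is, using brute-force enumeration over a tiny database. It is not CLS-Miner and uses none of its utility-list or pruning techniques. The toy transactions, the unit profits, the threshold, and all function names are hypothetical and chosen only for illustration. The sketch assumes the standard definitions: the utility of an itemset is the sum of quantity times unit profit over the transactions that contain it, and an itemset is closed if no proper superset has the same support.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
    {"a": 1, "b": 1, "c": 1},
]
# Hypothetical unit profit of each item.
profit = {"a": 5, "b": 2, "c": 1}


def support(itemset, db):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in db if all(i in t for i in itemset))


def utility(itemset, db, profit):
    """Sum of quantity * unit profit over the transactions containing the itemset."""
    total = 0
    for t in db:
        if all(i in t for i in itemset):
            total += sum(t[i] * profit[i] for i in itemset)
    return total


def closed_high_utility_itemsets(db, profit, minutil):
    """Brute force: enumerate every itemset, keep those that are closed and high-utility."""
    items = sorted({i for t in db for i in t})
    all_itemsets = [
        frozenset(c)
        for r in range(1, len(items) + 1)
        for c in combinations(items, r)
    ]
    sup = {x: support(x, db) for x in all_itemsets}
    result = {}
    for x in all_itemsets:
        if sup[x] == 0:
            continue
        # Closed: no proper superset occurs in exactly the same number of transactions.
        is_closed = not any(x < y and sup[y] == sup[x] for y in all_itemsets)
        u = utility(x, db, profit)
        if is_closed and u >= minutil:
            result[x] = u
    return result


chuis = closed_high_utility_itemsets(transactions, profit, minutil=15)
for itemset, u in sorted(chuis.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), u)
```

With this toy data and a threshold of 15, four itemsets are high-utility ({a}, {a, b}, {a, c}, {a, b, c}) but only two of them are closed ({a, c} and {a, b, c}), which illustrates why the closed representation is more concise. The enumeration is exponential in the number of items, which is the cost that dedicated algorithms aim to avoid.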
Year: 2019
DOI: 10.1007/s11704-016-6245-4
Venue: Frontiers of Computer Science
Keywords: utility mining, high-utility itemset mining, closed itemset mining, closed high-utility itemset mining
Field: Utility mining, CLs upper limits, Effective method, Computer science, Artificial intelligence, Execution time, Machine learning, Lossless compression
DocType: Journal
Volume: 13
Issue: 2
ISSN: 2095-2228
Citations: 3
PageRank: 0.37
References: 33
Authors: 4
Name                     Order  Citations  PageRank
Dam Thu-Lan              1      74         5.24
Kenli Li                 2      540        58.66
Philippe Fournier-Viger  3      1587       110.19
Q.-H. Duong              4      76         7.00