Title
Effect of Count Estimation in Finding Frequent Itemsets over Online Transactional Data Streams
Abstract
A data stream is a massive, unbounded sequence of data elements generated continuously at a rapid rate. For this reason, most algorithms for data streams sacrifice the correctness of their results in exchange for fast processing time. Processing time is greatly influenced by the amount of information that must be maintained. This issue becomes more serious when finding frequent itemsets, or counting frequencies, over an online transactional data stream, since a large number of itemsets may need to be monitored. We have proposed a method called the estDec method for finding frequent itemsets over an online data stream. To reduce the number of monitored itemsets in this method, monitoring the count of an itemset is delayed until its support is large enough for it to become a frequent itemset in the near future. For this purpose, the count of an itemset must be estimated. Consequently, how the count of an itemset is estimated is a critical issue in minimizing both memory usage and processing time. In this paper, the effects of various count estimation methods for finding frequent itemsets are analyzed in terms of mining accuracy, memory usage, and processing time.
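The delayed-monitoring idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which estimators the paper compares, so the code below assumes one simple choice: the count of an unmonitored itemset is bounded above by the smallest count among its (n-1)-subsets, which follows from the anti-monotone property of support. The names `estimate_count`, `process_transaction`, `insert_threshold`, and `max_size` are hypothetical, and the sketch is not the estDec implementation.

```python
from itertools import combinations


def estimate_count(itemset, monitored):
    """Upper-bound the count of an unmonitored itemset.

    Assumed estimator (not taken from the paper): the smallest count among
    the itemset's (n-1)-subsets. Returns 0 if any such subset is unmonitored.
    """
    items = tuple(sorted(itemset))
    if len(items) < 2:
        return 0
    counts = []
    for subset in combinations(items, len(items) - 1):
        if subset not in monitored:
            return 0
        counts.append(monitored[subset])
    return min(counts)


def process_transaction(transaction, monitored, total, insert_threshold, max_size=3):
    """Update monitored counts for one transaction and return the new total.

    An unmonitored itemset of size >= 2 starts being monitored only when its
    estimated support reaches insert_threshold (hypothetical parameter).
    """
    total += 1
    items = tuple(sorted(set(transaction)))
    for size in range(1, min(max_size, len(items)) + 1):
        for candidate in combinations(items, size):
            if candidate in monitored:
                monitored[candidate] += 1
            elif size == 1:
                # Single items are always monitored in this sketch.
                monitored[candidate] = 1
            else:
                estimate = estimate_count(candidate, monitored)
                if estimate / total >= insert_threshold:
                    # Monitoring starts from the estimated count.
                    monitored[candidate] = estimate
    return total
```

Because the estimate is an upper bound, an itemset can enter the monitored set with a count larger than its true count. This is one way the choice of estimator can trade mining accuracy against memory usage and processing time.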
Year: 2005
DOI: 10.1007/s11390-005-0007-3
Venue: J. Comput. Sci. Technol.
Keywords: transaction data
Field: Data mining, Data stream mining, Data stream, Computer science, Correctness, STREAMS, Transaction data
DocType: Journal
Volume: 20
Issue: 1
ISSN: 1860-4749
Citations: 3
PageRank: 0.40
References: 6
Authors: 2
Name | Order | Citations | PageRank
Joong Hyuk Chang | 1 | 401 | 19.81
Won Suk Lee | 2 | 536 | 51.26