Mining Frequent Itemsets through Progressive Sampling with Rademacher Averages - Citegraph

Paper Info

Title
Mining Frequent Itemsets through Progressive Sampling with Rademacher Averages

Abstract
We present an algorithm to extract an high-quality approximation of the (top-k) Frequent itemsets (FIs) from random samples of a transactional dataset. With high probability the approximation is a superset of the FIs, and no itemset with frequency much lower than the threshold is included in it. The algorithm employs progressive sampling, with a stopping condition based on bounds to the empirical Rademacher average, a key concept from statistical learning theory. The computation of the bounds uses characteristic quantities that can be obtained efficiently with a single scan of the sample. Therefore, evaluating the stopping condition is fast, and does not require an expensive mining of each sample. Our experimental evaluation confirms the practicality of our approach on real datasets, outperforming approaches based on one-shot static sampling.

Year	DOI	Venue
2015	10.1145/2783258.2783265	ACM Knowledge Discovery and Data Mining
Keywords	Field	DocType
Frequent Itemsets,Pattern Mining,Rademacher Averages,Sampling,Statistical Learning Theory	Statistical learning theory,Data mining,Subset and superset,Computer science,Sampling (statistics),Computation	Conference
Citations	PageRank	References
14	0.57	24
Authors
2

Authors (2 rows)

Cited by (14 rows)

References (24 rows)

Name	Order	Citations	PageRank
Matteo Riondato	1	340	20.63
Eli Upfal	2	4310	743.13

1