Title
Statistical Emerging Pattern Mining with Multiple Testing Correction
Abstract
Emerging patterns are patterns whose support differs significantly between two databases. We study the problem of listing emerging patterns with a multiple testing guarantee. Recently, Terada et al. proposed the Limitless Arity Multiple-testing Procedure (LAMP), which controls the family-wise error rate (FWER) in statistical association mining. LAMP reduces the number of "untestable" hypotheses without compromising its statistical power. Still, FWER is restrictive, and as a result, its statistical power is inherently unsatisfying when the number of patterns is large. On the other hand, the false discovery rate (FDR) is less restrictive than FWER, and thus controlling FDR yields a larger number of significant patterns. We propose two emerging pattern mining methods: the first one controls FWER, and the second one controls FDR. The effectiveness of the methods is verified in computer simulations with real-world datasets.
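The contrast between FWER and FDR control drawn in the abstract can be illustrated with two textbook procedures. The sketch below is not the paper's method and is not LAMP: it assumes plain Bonferroni correction as the FWER-controlling baseline and the Benjamini-Hochberg step-up procedure as the FDR-controlling one, and the p-values are made-up numbers chosen only for illustration.

```python
def bonferroni(p_values, alpha=0.05):
    """Indices rejected under Bonferroni correction (controls FWER)."""
    m = len(p_values)
    return [i for i, p in enumerate(p_values) if p <= alpha / m]


def benjamini_hochberg(p_values, alpha=0.05):
    """Indices rejected by the Benjamini-Hochberg step-up procedure
    (controls FDR when the tests are independent)."""
    m = len(p_values)
    # Sort hypothesis indices by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= k * alpha / m.
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * alpha / m:
            k = rank
    # Reject the k hypotheses with the smallest p-values.
    return sorted(order[:k])


# Hypothetical p-values, one per candidate pattern (illustrative only).
p_values = [0.001, 0.008, 0.012, 0.020, 0.030, 0.200, 0.500, 0.900]

fwer_hits = bonferroni(p_values)          # [0]
fdr_hits = benjamini_hochberg(p_values)   # [0, 1, 2, 3, 4]
print(len(fwer_hits), "pattern(s) significant under FWER control")
print(len(fdr_hits), "pattern(s) significant under FDR control")
```

On this toy input the FWER-controlling procedure keeps one pattern while the FDR-controlling one keeps five, which matches the abstract's point that controlling FDR yields more significant patterns.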
Year
2017
DOI
10.1145/3097983.3098137
Venue
KDD
Keywords
emerging pattern mining, multiple testing, statistical pattern mining
Field
Data mining, False discovery rate, Arity, Computer science, Word error rate, Multiple comparisons problem, Correlation and dependence, Artificial intelligence, Statistical power, Machine learning
DocType
Conference
Citations
4
PageRank
0.40
References
14
Authors
5
Name | Order | Citations | PageRank
Junpei Komiyama | 1 | 39 | 6.84
Masakazu Ishihata | 2 | 59 | 8.70
Hiroki Arimura | 3 | 1130 | 92.90
Takashi Nishibayashi | 4 | 4 | 0.40
Shin-ichi Minato | 5 | 725 | 84.72