Title
Finding interesting associations without support pruning
Abstract
Association-rule mining has heretofore relied on the condition of high support to do its work efficiently. In particular, the well-known Apriori algorithm is effective only when the rules of interest are relationships that occur very frequently. However, there are a number of applications, such as data mining, identification of similar web documents, clustering, and collaborative filtering, where the rules of interest have comparatively few instances in the data. In these cases, we must look for highly correlated items, or possibly even causal relationships between infrequent items. We develop a family of algorithms for solving this problem, employing a combination of random sampling and hashing techniques. We provide analysis of the algorithms developed and conduct experiments on real and synthetic data to obtain a comparative performance analysis.
Year: 2000
DOI: 10.1109/69.908981
Venue: IEEE Transactions on Knowledge and Data Engineering
Keywords: data mining, database theory, software performance evaluation, very large databases, Web documents, association rule mining, causal relationships, collaborative filtering, experiments, hashing, large databases, performance analysis, random sampling, similarity metric
Field: Data mining, Importance sampling, Data stream mining, Collaborative filtering, Computer science, Association rule learning, Synthetic data, Sampling (statistics), Hash function, Cluster analysis, Database
DocType: Conference
Volume: 13
Issue: 1
ISSN: 1041-4347
ISBN: 0-7695-0506-6
Citations: 249
PageRank: 26.60
References: 17
Authors: 4
Name              Order  Citations  PageRank
Edith Cohen       1      3260       268.21
Datar, M.         2      249        26.60
Fujiwara, S.      3      249        26.60
Aristides Gionis  4      6808       386.81