Abstract |
---|
Frequent Itemset Mining is a popular data mining task that aims to discover frequently co-occurring items and, hence, correlations hidden in data. Many attempts to apply this family of techniques to Big Data have been presented. Unfortunately, few implementations have proved to scale efficiently to huge collections of information. This review presents a comparison of a carefully selected subset of the most efficient and scalable approaches. Focusing on the Hadoop and Spark platforms, we consider not only the analysis dimensions typical of the data mining domain, but also criteria that matter in the Big Data environment. |
Year | DOI | Venue |
---|---|---|
2015 | 10.1007/978-3-319-23201-0_27 | Communications in Computer and Information Science |
Keywords | Field | DocType |
Frequent Itemset Mining, MapReduce, Spark, Data mining | Data science, Data mining, Spark (mathematics), Computer science, Implementation, Big data, Database, Scalability | Conference |
Volume | ISSN | Citations |
539 | 1865-0929 | 0 |
PageRank | References | Authors |
0.34 | 6 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Daniele Apiletti | 1 | 104 | 11.69 |
Paolo Garza | 2 | 426 | 39.13 |
Fabio Pulvirenti | 3 | 12 | 2.92 |