Abstract |
---|
Enterprises operate large data lakes using Hadoop and Spark frameworks that (1) run a plethora of tools to automate powerful data preparation/transformation pipelines, (2) run on shared, large clusters, and (3) perform many different analytics tasks ranging from model preparation and building to evaluation and tuning, for both machine learning and deep learning. Developing machine/deep learning models on data in such shared environments is challenging. Apache SystemML provides a unified framework for implementing machine learning and deep learning algorithms in a variety of shared deployment scenarios. SystemML's novel compilation approach automatically generates runtime execution plans for machine/deep learning algorithms that are composed of single-node and distributed runtime operations, depending on data and cluster characteristics such as data size, data sparsity, cluster size, and memory configurations, while still exploiting the capabilities of the underlying big data frameworks. |
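The abstract's key idea is that the compiler picks single-node or distributed operators per operation from memory estimates driven by data size, sparsity, and cluster configuration. A minimal, hypothetical Python sketch of that decision rule (not SystemML's actual compiler API; the function names, byte costs, and memory fraction are illustrative assumptions):

```python
def estimate_bytes(rows, cols, sparsity):
    """Rough operand-size estimate: a dense layout costs ~8 bytes per cell,
    a sparse layout ~16 bytes per non-zero (value plus index overhead).
    The cheaper of the two representations is assumed to be chosen."""
    dense = rows * cols * 8
    sparse = int(rows * cols * sparsity * 16)
    return min(dense, sparse)

def choose_backend(rows, cols, sparsity, driver_mem_bytes, mem_fraction=0.7):
    """Pick an execution backend for one operation: run it as a single-node
    in-memory operator if the operand fits within a fraction of the driver's
    memory budget, otherwise fall back to a distributed (Spark) operator."""
    if estimate_bytes(rows, cols, sparsity) <= driver_mem_bytes * mem_fraction:
        return "single-node"
    return "distributed"

# A 10,000 x 1,000 dense matrix (~80 MB) fits a 1 GB driver budget,
# so the compiler would keep the operation single-node:
print(choose_backend(10_000, 1_000, 1.0, 1 << 30))

# A 10,000,000 x 100,000 matrix at 1% sparsity (~160 GB even in sparse
# form) does not fit, so the operation would be compiled to Spark:
print(choose_backend(10_000_000, 100_000, 0.01, 1 << 30))
```

The real compiler applies such estimates per operator over a whole script, so a single algorithm can mix single-node and distributed operations in one plan.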
Year | Venue | Field
---|---|---
2018 | arXiv: Learning | Pipeline transport, Software deployment, Spark (mathematics), Ranging, Artificial intelligence, Deep learning, Analytics, Big data, Data preparation, Machine learning, Mathematics

DocType | Volume | Citations
---|---|---
Journal | abs/1802.04647 | 0

PageRank | References | Authors
---|---|---
0.34 | 6 | 6
Name | Order | Citations | PageRank
---|---|---|---
Niketan Pansare | 1 | 181 | 7.15 |
Michael Dusenberry | 2 | 38 | 1.59 |
Nakul Jindal | 3 | 0 | 0.34 |
Matthias Boehm | 4 | 127 | 6.17 |
Berthold Reinwald | 5 | 901 | 79.37 |
Prithviraj Sen | 6 | 837 | 38.24 |