Abstract |
---|
Data is the natural resource of the 21st century. It is being produced at dizzying rates, e.g., by sequencers for genomics, by very high resolution formats for media and entertainment, and by multitudes of sensors for the Internet of Things (IoT). Object stores such as AWS S3, Azure Blob Storage, and IBM Cloud Object Storage are highly scalable distributed storage systems that offer high-capacity, cost-effective storage for this data. But it is not enough just to store data; we also need to derive value from it. Apache Spark is the leading big data analytics processing engine. It runs up to one hundred times faster than Hadoop MapReduce and combines SQL, streaming, and complex analytics. In this poster we present Stocator, a high-performance storage connector that enables Spark to work directly on data stored in object storage systems. |
Year | DOI | Venue |
---|---|---|
2017 | 10.1145/3078468.3078496 | SYSTOR |
DocType | Volume | Citations |
---|---|---|
Journal | abs/1709.01812 | 1 |
PageRank | References | Authors |
---|---|---|
0.37 | 9 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Gil Vernik | 1 | 32 | 4.15 |
Michael Factor | 2 | 3 | 2.03 |
Elliot K. Kolodner | 3 | 483 | 83.65 |
Effi Ofer | 4 | 3 | 2.35 |
Pietro Michiardi | 5 | 1512 | 111.53 |
Francesco Pace | 6 | 4 | 2.13 |