Title
Stocator: a high performance object store connector for Spark.
Abstract
Data is the natural resource of the 21st century. It is being produced at dizzying rates, e.g., for genomics by sequencers, for media and entertainment with very high resolution formats, and for the Internet of Things (IoT) by multitudes of sensors. Object stores such as AWS S3, Azure Blob Storage, and IBM Cloud Object Storage are highly scalable distributed storage systems that offer high-capacity, cost-effective storage for this data. But it is not enough just to store data; we also need to derive value from it. Apache Spark is the leading big data analytics processing engine. It runs up to one hundred times faster than Hadoop MapReduce and combines SQL, streaming, and complex analytics. In this poster we present Stocator, a high performance storage connector that enables Spark to work directly on data stored in object storage systems.
Year: 2017
DOI: 10.1145/3078468.3078496
Venue: SYSTOR
DocType: Journal
Volume: abs/1709.01812
Citations: 1
PageRank: 0.37
References: 9
Authors: 6
Name | Order | Citations | PageRank
Gil Vernik | 1 | 32 | 4.15
Michael Factor | 2 | 3 | 2.03
Elliot K. Kolodner | 3 | 483 | 83.65
Effi Ofer | 4 | 3 | 2.35
Pietro Michiardi | 5 | 1512 | 111.53
Francesco Pace | 6 | 4 | 2.13