Title
Building the Enterprise Fabric for Big Data with Vertica and Spark Integration.
Abstract
Enterprise customers increasingly require greater flexibility in how they access and process their Big Data, while continuing to request advanced analytics and access to diverse data sources. At the same time, they still require the robustness of enterprise-class analytics for their mission-critical data. In this paper, we present our initial efforts toward a solution that satisfies these requirements by integrating the HPE Vertica enterprise database with Apache Spark's open source big data computation engine. In particular, the integration enables fast, reliable transfer of data between Vertica and Spark, and the deployment of machine learning models created by Spark into Vertica for predictive analytics on Vertica data. This integration provides a fabric on which our customers get the best of both worlds: it extends Vertica's extensive SQL analytics capabilities with Spark's machine learning library (MLlib), giving Vertica users access to a wide range of ML functions; it also enables customers to leverage Spark as an advanced ETL engine for all data that require the guarantees offered by Vertica.
Year
2016
DOI
10.1145/2882903.2903744
Venue
SIGMOD Conference
Field
SQL,Data mining,Predictive Model Markup Language,Spark (mathematics),Predictive analytics,Computer science,Robustness (computer science),Analytics,Big data,Database
DocType
Conference
Citations
1
PageRank
0.36
References
7
Authors
7
Name | Order | Citations | PageRank
Jeff LeFevre | 1 | 155 | 11.32
Rui Liu | 2 | 25 | 6.45
Cornelio Inigo | 3 | 1 | 0.36
Lupita Paz | 4 | 3 | 0.84
Edward Ma | 5 | 1 | 1.38
Malú Castellanos | 6 | 16 | 4.84
Meichun Hsu | 7 | 3437 | 778.34