Title
Presto: SQL on Everything
Abstract
Presto is an open source distributed query engine that supports much of the SQL analytics workload at Facebook. Presto is designed to be adaptive, flexible, and extensible. It supports a wide variety of use cases with diverse characteristics. These range from user-facing reporting applications with sub-second latency requirements to multi-hour ETL jobs that aggregate or join terabytes of data. Presto's Connector API allows plugins to provide a high performance I/O interface to dozens of data sources, including Hadoop data warehouses, RDBMSs, NoSQL systems, and stream processing systems. In this paper, we outline a selection of use cases that Presto supports at Facebook. We then describe its architecture and implementation, and call out features and performance optimizations that enable it to support these use cases. Finally, we present performance results that demonstrate the impact of our main design decisions.
Year
2019
DOI
10.1109/ICDE.2019.00196
Venue
2019 IEEE 35th International Conference on Data Engineering (ICDE)
Keywords
Facebook, Structured Query Language, Connectors, Engines, Optimization, Data warehouses, Tools
Field
SQL, Data warehouse, Use case, Computer science, Terabyte, NoSQL, Plug-in, Stream processing, Analytics, Database
DocType
Conference
ISSN
1084-4627
ISBN
978-1-5386-7474-1
Citations
2
PageRank
0.38
References
0
Authors
11
Name                 Order  Citations  PageRank
Raghav Sethi         1      2          0.38
Martin Traverso      2      2          0.38
Dain Sundstrom       3      2          0.38
David Phillips       4      2          0.38
Wenlei Xie           5      486        22.55
Yutian Sun           6      2          0.38
Nezih Yegitbasi      7      2          0.38
Haozhun Jin          8      2          0.38
Eric Hwang           9      2          0.38
Nileema Shingte      10     2          0.38
Christopher Berner   11     4          0.74