DBPal: A Fully Pluggable NL2SQL Training Pipeline - Citegraph

Paper Info

Title
DBPal: A Fully Pluggable NL2SQL Training Pipeline

Abstract
Natural language is a promising alternative interface to DBMSs because it enables non-technical users to formulate complex questions in a more concise manner than SQL. Recently, deep learning has gained traction for translating natural language to SQL, since similar ideas have been successful in the related domain of machine translation. However, the core problem with existing deep learning approaches is that they require an enormous amount of training data in order to provide accurate translations. This training data is extremely expensive to curate, since it generally requires humans to manually annotate natural language examples with the corresponding SQL queries (or vice versa). Based on these observations, we propose DBPal, a new approach that augments existing deep learning techniques in order to improve the performance of models for natural language to SQL translation. More specifically, we present a novel training pipeline that automatically generates synthetic training data in order to (1) improve overall translation accuracy, (2) increase robustness to linguistic variation, and (3) specialize the model for the target database. As we show, our DBPal training pipeline is able to improve both the accuracy and linguistic robustness of state-of-the-art natural language to SQL translation models.

Field	Value
Year	2020
DOI	10.1145/3318464.3380589
Venue	SIGMOD/PODS '20: International Conference on Management of Data, Portland, OR, USA, June 2020
DocType	Conference
ISBN	978-1-4503-6735-6
Citations	1
PageRank	0.35
References	37
Authors	12

Authors (12 rows)

Name	Order	Citations	PageRank
Nathaniel Weir	1	3	2.40
Prasetya Utama	2	3	3.07
Alex Galakatos	3	20	2.13
Andrew Crotty	4	100	9.10
Amir Rahimzadeh Ilkhechi	5	12	2.66
Shekar Ramaswamy	6	1	0.35
Rohin Bhushan	7	1	0.35
Nadja Geisler	8	1	1.36
Benjamin Hättasch	9	3	4.09
Steffen Eger	10	77	25.00
Ugur Cetintemel	11	445	28.03
Carsten Binnig	12	619	61.38
