Title
CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases
Abstract
We present CoSQL, a corpus for building cross-domain, general-purpose database (DB) querying dialogue systems. It consists of 30k+ turns plus 10k+ annotated SQL queries, obtained from a Wizard-of-Oz (WOZ) collection of 3k dialogues querying 200 complex DBs spanning 138 domains. Each dialogue simulates a real-world DB query scenario with a crowd worker as a user exploring the DB and a SQL expert retrieving answers with SQL, clarifying ambiguous questions, or otherwise informing the user when a question is unanswerable. When user questions are answerable by SQL, the expert describes the SQL and execution results to the user, hence maintaining a natural interaction flow. CoSQL introduces new challenges compared to existing task-oriented dialogue datasets: (1) the dialogue states are grounded in SQL, a domain-independent executable representation, instead of domain-specific slot-value pairs, and (2) because testing is done on unseen databases, success requires generalizing to new domains. CoSQL includes three tasks: SQL-grounded dialogue state tracking, response generation from query results, and user dialogue act prediction. We evaluate a set of strong baselines for each task and show that CoSQL presents significant challenges for future research. The dataset, baselines, and leaderboard will be released at https://yale-lily.github.io/cosql.
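To make the idea of a dialogue state grounded in SQL more concrete, the following is a minimal sketch rather than anything taken from the CoSQL release. The `singer` table, its rows, the three user turns, and the `answerable` / `unanswerable` labels are all hypothetical, invented for illustration. Each turn carries the SQL query that would stand as the state after that turn, in place of slot-value pairs, and a turn that SQL cannot answer carries no query.

```python
import sqlite3

# Hypothetical toy database: the schema and rows are invented for this
# sketch and are not part of CoSQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE singer (name TEXT, country TEXT, age INTEGER);
    INSERT INTO singer VALUES
        ('Ana', 'France', 29),
        ('Ben', 'France', 41),
        ('Cy',  'Japan',  35);
""")

# Each user turn is paired with an illustrative label and, when the
# question is answerable, the SQL query acting as the dialogue state.
dialogue = [
    {"user": "Which singers are from France?",
     "label": "answerable",
     "sql": "SELECT name FROM singer WHERE country = 'France' ORDER BY name"},
    {"user": "Only those older than 30.",
     "label": "answerable",
     "sql": "SELECT name FROM singer "
            "WHERE country = 'France' AND age > 30 ORDER BY name"},
    {"user": "Is Ben a good singer?",
     "label": "unanswerable",
     "sql": None},
]


def run_turn(turn):
    """Execute the turn's SQL state; return None when there is no query."""
    if turn["sql"] is None:
        return None
    return [row[0] for row in conn.execute(turn["sql"])]


results = [run_turn(turn) for turn in dialogue]
for turn, rows in zip(dialogue, results):
    print(turn["user"], "->", rows)
```

The second query restates the full constraint set (country and age) instead of only the new condition, which is one way to read "state" here: the query after each turn reflects everything the user has asked for so far.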
Year
2019
DOI
10.18653/v1/D19-1204
Venue
EMNLP/IJCNLP (1)
DocType
Conference
Volume
D19-1
Citations
3
PageRank
0.38
References
0
Authors
24
Name | Order | Citations | PageRank
Tao Yu | 1 | 25 | 6.78
Rui Zhang | 2 | 68 | 8.10
Heyang Er | 3 | 5 | 0.74
Suyi Li | 4 | 3 | 0.72
Eric Xue | 5 | 5 | 0.74
Bo Pang | 6 | 3 | 0.38
Victoria Lin | 7 | 45 | 3.39
Yi Chern Tan | 8 | 4 | 2.08
Tianze Shi | 9 | 34 | 6.29
Zihan Li | 10 | 3 | 1.05
Youxuan Jiang | 11 | 3 | 0.38
Michihiro Yasunaga | 12 | 28 | 5.12
Sungrok Shim | 13 | 5 | 0.74
Tao Chen | 14 | 3 | 0.72
Alexander Richard Fabbri | 15 | 4 | 4.79
Zifan Li | 16 | 26 | 6.89
Luyao Chen | 17 | 3 | 0.38
Yuwen Zhang | 18 | 3 | 0.72
Shreya Dixit | 19 | 3 | 0.38
Vincent Zhang | 20 | 3 | 0.38
Caiming Xiong | 21 | 969 | 69.56
Richard Socher | 22 | 6770 | 230.61
Walter Lasecki | 23 | 7 | 2.29
Dragomir Radev | 24 | 5 | 1.08