Title
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation.
Abstract
Benchmark datasets have a significant impact on accelerating research in programming language tasks. In this paper, we introduce CodeXGLUE, a benchmark dataset to foster machine learning research for program understanding and generation. CodeXGLUE includes a collection of 10 tasks across 14 datasets and a platform for model evaluation and comparison. CodeXGLUE also features three baseline systems, including the BERT-style, GPT-style, and Encoder-Decoder models, to make it easy for researchers to use the platform. The availability of such data and baselines can help the development and validation of new methods that can be applied to various program understanding and generation problems.
Year
2021
Venue
Annual Conference on Neural Information Processing Systems
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
22
Name                  Order   Citations / PageRank
Shuai Lu              1       13219.25
Daya Guo              2       6 / 4.81
Shuo Ren              3       10 / 2.48
Junjie Huang          4       0 / 0.68
Alexey Svyatkovskiy   5       5 / 1.45
Ambrosio Blanco       6       0 / 1.01
Colin Clement         7       0 / 1.01
Dawn Drain            8       1 / 2.06
Daxin Jiang           9       0 / 4.06
Duyu Tang             10      88336.98
Ge Li                 11      0 / 0.34
Lidong Zhou           12      0 / 0.34
Linjun Shou           13      1310.73
Long Zhou             14      20 / 3.01
Michele Tufano        15      0 / 2.03
Ming Gong             16      1711.45
Ming Zhou             17      4262251.74
Nan Duan              18      21345.87
Neel Sundaresan       19      84976.13
Shao Kun Deng         20      0 / 1.01
Shengyu Fu            21      0 / 0.68
Shujie Liu            22      33837.84