CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation.

Paper Info

Abstract
Benchmark datasets have a significant impact on accelerating research in programming language tasks. In this paper, we introduce CodeXGLUE, a benchmark dataset to foster machine learning research for program understanding and generation. CodeXGLUE includes a collection of 10 tasks across 14 datasets and a platform for model evaluation and comparison. CodeXGLUE also features three baseline systems, including the BERT-style, GPT-style, and Encoder-Decoder models, to make it easy for researchers to use the platform. The availability of such data and baselines can help the development and validation of new methods that can be applied to various program understanding and generation problems.

Year	Venue	DocType	Citations	PageRank	References	Authors
2021	Annual Conference on Neural Information Processing Systems	Conference	0	0.34	0	22

Authors (22 rows)

Name	Order	Citations	PageRank
Shuai Lu	1	132	19.25
Daya Guo	2	6	4.81
Shuo Ren	3	10	2.48
Junjie Huang	4	0	0.68
Alexey Svyatkovskiy	5	5	1.45
Ambrosio Blanco	6	0	1.01
Colin Clement	7	0	1.01
Dawn Drain	8	1	2.06
Daxin Jiang	9	0	4.06
Duyu Tang	10	883	36.98
Ge Li	11	0	0.34
Lidong Zhou	12	0	0.34
Linjun Shou	13	13	10.73
Long Zhou	14	20	3.01
Michele Tufano	15	0	2.03
Ming Gong	16	17	11.45
Ming Zhou	17	4262	251.74
Nan Duan	18	213	45.87
Neel Sundaresan	19	849	76.13
Shao Kun Deng	20	0	1.01
Shengyu Fu	21	0	0.68
Shujie Liu	22	338	37.84