Title
Retrieval-based neural source code summarization
Abstract
Source code summarization aims to automatically generate concise natural-language summaries of source code, helping developers understand and maintain it. Traditional approaches rely on information retrieval techniques, which select terms from the original source code or adapt the summaries of similar code snippets. Recent studies adopt Neural Machine Translation techniques and generate summaries from code snippets using encoder-decoder neural networks. These neural approaches favor high-frequency words in the corpus and have trouble with low-frequency ones. In this paper, we propose a retrieval-based neural source code summarization approach that enhances the neural model with the most similar code snippets retrieved from the training set, so that it can take advantage of both neural and retrieval-based techniques. Specifically, we first train an attentional encoder-decoder model on the code snippets and summaries in the training set. Second, given an input code snippet for testing, we retrieve its two most similar code snippets from the training set, one in terms of syntax and one in terms of semantics. Third, we encode the input and the two retrieved code snippets, and predict the summary by fusing them during decoding. We conduct extensive experiments to evaluate our approach, and the experimental results show that it improves over state-of-the-art methods.
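The retrieve-then-fuse idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how similarity is measured or how the fusion is computed. The bag-of-tokens cosine similarity, the additive similarity-weighted fusion, the weight `lam`, and the helper names `tokenize`, `cosine`, `retrieve`, and `fuse` are all assumptions made for illustration. The next-word distributions, which would come from the trained encoder-decoder, are stood in for by plain dictionaries.

```python
import math
import re
from collections import Counter


def tokenize(code):
    # Crude lexical split (assumption): identifiers, or single non-space symbols.
    return re.findall(r"[A-Za-z_]\w*|\S", code)


def cosine(a, b):
    # Cosine similarity between two bags of tokens.
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query, corpus):
    # Return (index, similarity) of the training snippet most similar to the query.
    q = tokenize(query)
    scores = [cosine(q, tokenize(c)) for c in corpus]
    best = max(range(len(corpus)), key=scores.__getitem__)
    return best, scores[best]


def fuse(p_input, retrieved, lam=1.0):
    # Hypothetical fusion rule: add each retrieved snippet's next-word
    # distribution, weighted by its similarity to the input, then renormalize.
    # `retrieved` is a list of (distribution, similarity) pairs.
    raw = dict(p_input)
    for dist, sim in retrieved:
        for word, prob in dist.items():
            raw[word] = raw.get(word, 0.0) + lam * sim * prob
    total = sum(raw.values())
    return {word: value / total for word, value in raw.items()}


corpus = [
    "int add(int a, int b) { return a + b; }",
    "void close(File f) { f.close(); }",
]
idx, sim = retrieve("int sum(int x, int y) { return x + y; }", corpus)

# Toy next-word distributions standing in for decoder outputs.
p_in = {"add": 0.6, "close": 0.4}
fused = fuse(p_in, [({"add": 1.0}, 0.5)])
```

In this sketch a word supported by a similar retrieved snippet gains probability mass after fusion, which is one way a retrieval signal could help with the low-frequency words the abstract mentions. The approach described above retrieves two snippets (one syntactic, one semantic); `fuse` accepts any number of retrieved pairs, but the single similarity function here does not distinguish the two aspects.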
Year
2020
DOI
10.1145/3377811.3380383
Venue
International Conference on Software Engineering
Keywords
Source code summarization, Information retrieval, Deep neural network
DocType
Conference
ISSN
0270-5257
Citations
16
PageRank
0.55
References
0
Authors
5
Name          Order  Citations  PageRank
Jian Zhang    1      72         11.83
Xu Wang       2      120        16.42
Hongyu Zhang  3      18         2.66
Hailong Sun   4      680        64.83
Xudong Liu    5      769        100.74