Title
Mapping Bug Reports to Relevant Source Code Files Based on the Vector Space Model and Word Embedding.
Abstract
Software bug localization is a cumbersome and time-consuming part of software maintenance and evolution, yet it is very important, especially for large-scale software projects. To lighten developers' workload, researchers have developed various information retrieval (IR)-based bug localization models for automated software support. In this paper, we propose a new method that reduces the time required for bug localization. First, the surface lexical similarity between a bug report and a source code file is calculated based on the vector space model. Second, to address the lexical gap between programming language and natural language, word vectors are used to calculate the semantic similarity between the bug report and the source code file. Then, the surface lexical and semantic similarities are combined into a total similarity used to detect buggy source code files. The word vectors in our experiments are trained with the Skip-gram and GloVe models. We select an optimal 100-dimensional word vector for bug localization by evaluating it on four open source software projects. Finally, our experimental results show that our method outperforms classical IR-based methods on several indicators for locating relevant source code files.
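The pipeline outlined in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the whitespace tokenizer, the smoothed TF-IDF weighting, averaging word vectors to represent a document, the weighted sum controlled by the hypothetical parameter `alpha`, and the toy two-dimensional embeddings and file names (the paper uses trained 100-dimensional Skip-gram and GloVe vectors).

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: plain lowercase whitespace tokenization.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Assumption: a smoothed IDF, one common variant among several.
    idf = {t: math.log((n + 1) / (df[t] + 1)) + 1.0 for t in df}
    return [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]


def cosine_sparse(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def mean_vector(tokens, embeddings, dim):
    # Assumption: a document is represented by the mean of its known word vectors.
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine_dense(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rank_files(report, files, embeddings, dim, alpha=0.5):
    """Rank source files by a weighted sum of lexical and semantic similarity.

    alpha is a hypothetical mixing weight; the paper's actual combination
    formula is not given in the abstract.
    """
    names = list(files)
    docs = [tokenize(report)] + [tokenize(files[n]) for n in names]
    vectors = tfidf_vectors(docs)
    report_mean = mean_vector(docs[0], embeddings, dim)
    scored = []
    for name, doc, vec in zip(names, docs[1:], vectors[1:]):
        lexical = cosine_sparse(vectors[0], vec)
        semantic = cosine_dense(report_mean, mean_vector(doc, embeddings, dim))
        scored.append((name, alpha * lexical + (1 - alpha) * semantic))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: hypothetical file names, contents, and 2-d embeddings.
toy_embeddings = {
    "crashes": [0.9, 0.1], "saving": [0.8, 0.2], "save": [0.8, 0.2],
    "write": [0.7, 0.3], "file": [0.6, 0.4],
    "user": [0.2, 0.8], "login": [0.1, 0.9], "password": [0.0, 1.0],
}
toy_files = {
    "FileWriter.java": "write file buffer flush save",
    "LoginView.java": "user password login button",
}
ranking = rank_files("app crashes when saving file", toy_files, toy_embeddings, dim=2)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setup the report shares only the token "file" with one source file, so the lexical score alone carries little signal; the averaged word vectors let "saving" and "save" contribute as well, which is the lexical gap the abstract refers to.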
Year: 2019
DOI: 10.1109/ACCESS.2019.2922686
Venue: IEEE ACCESS
Keywords: Bug localization, information retrieval, surface lexical similarity, semantic similarity, bug report, word embedding
Field: Source code, Computer science, Theoretical computer science, Word embedding, Vector space model, Distributed computing
DocType: Journal
Volume: 7
ISSN: 2169-3536
Citations: 0
PageRank: 0.34
References: 0
Authors: 5
Name | Order | Citations | PageRank
Guangliang Liu | 1 | 0 | 0.68
Lu Yang | 2 | 9 | 2.88
Ke Shi | 3 | 7 | 2.58
Jingfei Chang | 4 | 0 | 1.35
Xing Wei | 5 | 17 | 7.15