Title
ReLink: recovering links between bugs and changes
Abstract
Software defect information, including links between bugs and committed changes, plays an important role in software maintenance tasks such as measuring quality and predicting defects. Usually, the links are automatically mined from change logs and bug reports using heuristics such as searching for specific keywords and bug IDs in change logs. However, the accuracy of these heuristics depends on the quality of change logs. Bird et al. found that many links are missing because change logs lack bug references. They also found that the missing links lead to biased defect information, which affects defect prediction performance. We manually inspected the explicit links, which have explicit bug IDs in change logs, and observed that the links exhibit certain features. Based on our observation, we developed an automatic link recovery algorithm, ReLink, which automatically learns criteria of features from explicit links to recover missing links. We applied ReLink to three open source projects. ReLink reliably identified links with 89% precision and 78% recall on average, while the traditional heuristics alone achieve 91% precision and 64% recall. We also evaluated the impact of recovered links on software maintainability measurement and defect prediction, and found that the results of ReLink yield significantly better accuracy than those of traditional heuristics.
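The traditional heuristic mentioned in the abstract (searching change logs for keywords and bug IDs) can be illustrated with a minimal sketch. This is not the ReLink algorithm and not the authors' code: the regular expression, the function name `explicit_links`, and the sample logs are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical pattern (an assumption, not from the paper): a keyword such as
# "bug" or "fixes", optionally followed by '#' or ':', then a numeric bug ID.
BUG_ID = re.compile(
    r"(?:bug|issue|fix(?:es|ed)?|defect)\s*[#:]?\s*(\d+)",
    re.IGNORECASE,
)

def explicit_links(change_logs, known_bug_ids):
    """Return (revision, bug_id) pairs whose log message names a known bug ID."""
    links = []
    for revision, message in change_logs:
        for match in BUG_ID.finditer(message):
            bug_id = int(match.group(1))
            # Keep only IDs that exist in the bug tracker to limit false matches.
            if bug_id in known_bug_ids:
                links.append((revision, bug_id))
    return links

# Invented sample data for illustration only.
logs = [
    ("r101", "Fixed bug #4711: null check in parser"),
    ("r102", "Refactor parser, no functional change"),
    ("r103", "fixes 4712 and cleans up imports"),
]
links = explicit_links(logs, known_bug_ids={4711, 4712})
print(links)
```

In this sketch a commit such as `r102` produces no link. If it had in fact fixed a bug without naming it, the link would be silently missed, which is the kind of gap the abstract says ReLink is designed to recover.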
Year
2011
DOI
10.1145/2025113.2025120
Venue
SIGSOFT FSE
Keywords
bug reference,change log,traditional heuristics,bug ids,committed change,defect information,missing link,explicit link,relink yield,defect prediction,software maintenance,data quality
Field
Data mining,Data quality,Computer science,Software bug,Heuristics,Software maintenance,Maintainability
DocType
Conference
Citations
141
PageRank
2.94
References
28
Authors
4
Name            Order  Citations  PageRank
Rongxin Wu      1      528        19.69
Hongyu Zhang    2      901        37.18
Sunghun Kim     3      3036       114.11
S. C. Cheung    4      2657       162.89