Title
Identifying self-admitted technical debt through code comment analysis with a contextualized vocabulary
Abstract
Context: Previous work has shown that code comments can be analyzed with a contextualized vocabulary to detect Self-Admitted Technical Debt (SATD). However, current detection strategies still return a large number of false-positive items. Moreover, those strategies do not automatically identify the type of debt of the items they find.
Objective: This work applies, evaluates, and improves a set of contextualized patterns we built to detect self-admitted technical debt using code comment analysis. We refer to this set of patterns as the self-admitted technical debt identification vocabulary.
Method: We carry out three empirical studies. First, 23 participants analyze the patterns of a previously defined contextualized vocabulary and record how important each pattern is for identifying SATD items. Second, we perform a qualitative analysis to investigate the relation between each pattern and the types of debt. Finally, we perform a feasibility study using a new vocabulary, improved based on the results of the two previous studies, to automatically identify self-admitted technical debt items and their debt types in three open source projects.
Results: More than half of the new patterns were considered decisive or very decisive for detecting technical debt items. The new vocabulary was able to find items associated with code, design, defect, documentation, and requirement debt. Thus, the result of the work is an improved vocabulary that considers the level of importance of each pattern and the relationship between patterns and debt types to support the identification and classification of SATD items.
Conclusion: The studies allowed us to improve a vocabulary for identifying self-admitted technical debt items through code comment analysis. The results show that pattern-based code comment analysis can help improve existing methods, or create new ones, for automatically identifying and classifying technical debt items.
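The pattern-based detection described in the abstract can be illustrated with a minimal sketch. It assumes that each vocabulary entry carries a phrase, an importance level, and an associated debt type. The entries, the 1 to 4 importance scale, and the names `Pattern`, `VOCABULARY`, and `find_satd` are hypothetical and chosen for illustration only; they are not taken from the paper's actual vocabulary or tooling.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    text: str        # phrase searched for in a comment
    importance: int  # hypothetical 1-4 scale (4 = very decisive)
    debt_type: str   # debt type the pattern is associated with


# Hypothetical vocabulary entries, for illustration only.
VOCABULARY = [
    Pattern("todo", 2, "code"),
    Pattern("workaround", 3, "design"),
    Pattern("fixme", 4, "defect"),
    Pattern("needs documentation", 3, "documentation"),
    Pattern("not implemented yet", 4, "requirement"),
]


def find_satd(comment, min_importance=3):
    """Return (pattern text, debt type) pairs matched in a comment.

    Patterns below min_importance are skipped, which is one simple way
    to trade recall for fewer false positives.
    """
    lowered = comment.lower()
    hits = []
    for p in VOCABULARY:
        if p.importance < min_importance:
            continue
        # Whole-phrase, case-insensitive match on word boundaries.
        if re.search(r"\b" + re.escape(p.text) + r"\b", lowered):
            hits.append((p.text, p.debt_type))
    return hits
```

For example, `find_satd("// FIXME: this is a workaround until the parser is rewritten")` reports one design-debt and one defect-debt match, while a comment containing only `TODO` is ignored unless `min_importance` is lowered.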
Year
2020
DOI
10.1016/j.infsof.2020.106270
Venue
Information and Software Technology
Keywords
Technical debt, Self-admitted technical debt, Technical debt identification, Code comment analysis
Field
Data mining, Comment analysis, Information retrieval, Computer science, Debt, Technical debt, Documentation, Vocabulary, Empirical research, False positive paradox
DocType
Journal
Volume
121
ISSN
0950-5849
Citations
6
PageRank
0.45
References
0
Authors
4