Title
Predicting Consistency-Maintenance Requirement of Code Clones at Copy-and-Paste Time
Abstract
Code clones have always been a double-edged sword in software development. On one hand, cloning is a very convenient way to reuse existing code and to save coding effort. On the other hand, since developers may need to ensure consistency among cloned code segments, code clones can lead to extra maintenance effort and even bugs. Recent studies on the evolution of code clones show that only some code clones experience consistent changes during their evolution history. Therefore, if we can accurately predict whether a code clone will experience consistent changes, we will be able to provide useful recommendations to developers on leveraging the convenience of some code cloning operations while avoiding others, thereby reducing future consistency-maintenance effort. In this paper, we define a code cloning operation as consistency-maintenance-required if its generated code clones experience consistent changes in the software evolution history, and we propose a novel approach that automatically predicts, at the time of a copy-and-paste operation, whether that code cloning operation requires consistency maintenance. Our insight is that whether a code cloning operation requires consistency maintenance may relate to the characteristics of the code to be cloned and the characteristics of its context. Based on a number of attributes extracted from the cloned code and the context of the code cloning operation, we use Bayesian Networks, a machine-learning technique, to predict whether an intended code cloning operation requires consistency maintenance.
We evaluated our approach on four subjects (two large-scale Microsoft software projects and two popular open-source software projects) under two usage scenarios: 1) recommend that developers perform only the cloning operations predicted to be very likely consistency-maintenance-free, and 2) recommend that developers perform all cloning operations unless they are predicted to be very likely consistency-maintenance-required. In the first scenario, our approach is able to recommend that developers perform more than 50 percent of cloning operations, with a precision of at least 94 percent in the four subjects. In the second scenario, our approach is able to avoid 37 to 72 percent of consistency-maintenance-required code clones by warning developers about only 13 to 40 percent of code clones in the four subjects.
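The two usage scenarios above can be read as two decision rules applied to the predictor's output. The following is only a minimal sketch under stated assumptions: `p_required` stands for the probability, produced by some trained classifier (the abstract names Bayesian Networks), that a cloning operation is consistency-maintenance-required. The function names and both threshold values are hypothetical, since the abstract does not state how "very likely" is quantified.

```python
# Hypothetical thresholds: the abstract does not give the values used.
FREE_THRESHOLD = 0.9      # scenario 1: "very likely" consistency-maintenance-free
REQUIRED_THRESHOLD = 0.9  # scenario 2: "very likely" consistency-maintenance-required


def recommend_conservative(p_required: float) -> bool:
    """Scenario 1: recommend cloning only if it is very likely maintenance-free."""
    return (1.0 - p_required) >= FREE_THRESHOLD


def recommend_permissive(p_required: float) -> bool:
    """Scenario 2: recommend cloning unless it is very likely maintenance-required."""
    return p_required < REQUIRED_THRESHOLD


if __name__ == "__main__":
    # Illustrative probabilities only, not results from the paper.
    for p in (0.05, 0.50, 0.95):
        print(p, recommend_conservative(p), recommend_permissive(p))
```

Under this reading, the first rule trades coverage for precision (few clones are recommended, but those that are rarely need maintenance), while the second warns only on the most suspicious operations.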
Year: 2014
DOI: 10.1109/TSE.2014.2323972
Venue: IEEE Trans. Software Eng.
Keywords: public domain software, belief networks, maintenance effort, learning (artificial intelligence), code clones, consistency-maintenance requirement, consistency maintenance effort, consistency maintenance, software development, software maintenance, code cloning operations, programming aid, code cloning, machine-learning technique, bayesian networks, microsoft software projects, copy-and-paste time, open-source software projects, maintenance engineering, history, cloning
Field: Code coverage, Static program analysis, Programming language, Software engineering, Computer science, Source code, Real-time computing, Software maintenance, Software evolution, Code (cryptography), Software development, Code review
DocType: Journal
Volume: 40
Issue: 8
ISSN: 0098-5589
Citations: 5
PageRank: 0.42
References: 0
Authors: 6
Name            Order  Citations  PageRank
Xiaoyin Wang    1      749        29.19
Yingnong Dang   2      537        26.92
Lingming Zhang  3      2726       154.39
Dongmei Zhang   4      1439       132.94
Erica Lan       5      24         0.99
Hong Mei        6      3535       219.36