Title
Incremental origin analysis of source code files
Abstract
The history of software systems tracked by version control systems is often incomplete because many file movements are not recorded. However, static code analyses that mine the file history, such as change frequency or code churn, produce precise results only if the complete history of a source code file is available. In this paper, we show that up to 38.9% of the files in open source systems have an incomplete history, and we propose an incremental, commit-based approach to reconstruct the history based on clone information and name similarity. With this approach, the history of a file can be reconstructed across repository boundaries and thus provides accurate information for any source code analysis. We evaluate the approach in terms of correctness, completeness, performance, and relevance with a case study among seven open source systems and a developer survey.
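The idea of recovering a file's predecessor from content overlap and name similarity can be illustrated with a minimal sketch. This is not the authors' implementation: the line-based `difflib` ratio is only a crude stand-in for clone detection, and the function names, the `deleted`/`added` dictionaries (path to file content for a single commit), and both thresholds are hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher
from pathlib import PurePosixPath


def content_similarity(a: str, b: str) -> float:
    """Line-based similarity in [0, 1]; a crude stand-in for clone detection."""
    return SequenceMatcher(None, a.splitlines(), b.splitlines()).ratio()


def name_similarity(path_a: str, path_b: str) -> float:
    """Character-based similarity of the two file names, ignoring directories."""
    name_a = PurePosixPath(path_a).name
    name_b = PurePosixPath(path_b).name
    return SequenceMatcher(None, name_a, name_b).ratio()


def best_match(candidates, score):
    """Return (candidate, score) for the highest-scoring candidate, or (None, 0.0)."""
    best, best_score = None, 0.0
    for candidate in candidates:
        s = score(candidate)
        if s > best_score:
            best, best_score = candidate, s
    return best, best_score


def find_origins(deleted, added, content_threshold=0.7, name_threshold=0.8):
    """Map each added path to its likely predecessor among the deleted paths.

    Both arguments map a path to file content for one commit (assumed inputs).
    Content overlap is tried first; name similarity is the fallback.
    Thresholds are illustrative assumptions, not values from the paper.
    """
    origins = {}
    for new_path, new_text in added.items():
        # Step 1: look for a deleted file with largely the same content.
        old, s = best_match(deleted, lambda p: content_similarity(deleted[p], new_text))
        if s >= content_threshold:
            origins[new_path] = old
            continue
        # Step 2: fall back to a deleted file with a very similar name.
        old, s = best_match(deleted, lambda p: name_similarity(p, new_path))
        origins[new_path] = old if s >= name_threshold else None
    return origins
```

Processing one commit at a time in this way, and carrying the resulting path-to-origin mapping forward, is what makes such an analysis incremental: each commit is compared only against the state left by its predecessor rather than against the full history.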
Year
2014
DOI
10.1145/2597073.2597111
Venue
MSR
Keywords
clone detection, algorithms, origin analysis, software evolution, metrics
Field
Codebase, Static program analysis, Data mining, Computer science, Source code, Correctness, Software system, Redundant code, Class implementation file, KPI-driven code analysis, Database
DocType
Conference
Citations
8
PageRank
0.43
References
19
Authors
3
Name, Order, Citations, PageRank
Daniela Steidl, 1, 103, 5.59
Benjamin Hummel, 2, 660, 29.51
Elmar Juergens, 3, 743, 31.07