Abstract |
---|
The history of software systems tracked by version control systems is often incomplete because many file movements are not recorded. However, static code analyses that mine the file history, such as change frequency or code churn, produce precise results only if the complete history of a source code file is available. In this paper, we show that up to 38.9% of the files in open source systems have an incomplete history, and we propose an incremental, commit-based approach to reconstruct the history based on clone information and name similarity. With this approach, the history of a file can be reconstructed across repository boundaries and thus provides accurate information for any source code analysis. We evaluate the approach in terms of correctness, completeness, performance, and relevance with a case study among seven open source systems and a developer survey. |
Year | DOI | Venue |
---|---|---|
2014 | 10.1145/2597073.2597111 | MSR |
Keywords | Field | DocType |
---|---|---|
clone detection, algorithms, origin analysis, software evolution, metrics | Codebase, Static program analysis, Data mining, Computer science, Source code, Correctness, Software system, Redundant code, Class implementation file, KPI-driven code analysis, Database | Conference |
Citations | PageRank | References |
---|---|---|
8 | 0.43 | 19 |
Authors |
---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Daniela Steidl | 1 | 103 | 5.59 |
Benjamin Hummel | 2 | 660 | 29.51 |
Elmar Juergens | 3 | 743 | 31.07 |