Title
Defining a Software Maintainability Dataset: Collecting, Aggregating and Analysing Expert Evaluations of Software Maintainability
Abstract
Before controlling the quality of software systems, we need to assess it. In the case of maintainability, this often happens with manual expert reviews. Current automatic approaches have received criticism because their results often do not reflect the opinion of experts or are biased towards a small group of experts. We use the judgments of a significantly larger expert group to create a robust maintainability dataset. In a large-scale survey, 70 professionals assessed code from 9 open and closed source Java projects with a combined size of 1.4 million source lines of code. The assessment covers an overall judgment as well as an assessment of several subdimensions of maintainability. Among these subdimensions, we present evidence that understandability is valued the most by the experts. Our analysis also reveals that disagreement between evaluators occurs frequently. Significant dissent was detected in 17% of the cases. To overcome these differences, we present a method to determine a consensus, i.e., the most probable true label. The resulting dataset contains the consensus of the experts for more than 500 Java classes. This corpus can be used to learn precise and practical classifiers for software maintainability.
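The abstract does not spell out how the consensus label is computed from the individual expert ratings. As a purely illustrative sketch, a simple majority-vote baseline over ordinal ratings might look like the following; the label scale, function names, tie-breaking rule, and dissent threshold are assumptions made for this example and are not taken from the paper.

```python
from collections import Counter

# Hypothetical ordinal maintainability scale (assumed for this sketch).
LABELS = ["very poor", "poor", "good", "very good"]


def consensus_label(ratings):
    """Return the most frequent label among the expert ratings.

    Ties are broken towards the worse (more conservative) label,
    an arbitrary choice made for this sketch.
    """
    counts = Counter(ratings)
    return max(LABELS, key=lambda label: (counts[label], -LABELS.index(label)))


def has_dissent(ratings, threshold=0.5):
    """Flag a class as contentious if no label reaches the given share of votes."""
    counts = Counter(ratings)
    top_share = max(counts.values()) / len(ratings)
    return top_share < threshold


# Example: three hypothetical expert ratings for one Java class.
votes = ["good", "good", "poor"]
print(consensus_label(votes))  # -> "good"
print(has_dissent(votes))      # -> False (2/3 agreement)
```

A probabilistic aggregation model that weights evaluators by reliability could replace the majority vote here; the paper itself should be consulted for the actual procedure.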
Year
2020
DOI
10.1109/ICSME46990.2020.00035
Venue
2020 IEEE International Conference on Software Maintenance and Evolution (ICSME)
Keywords
Software Maintenance, Software Quality, Machine Learning, Software Measurement
DocType
Conference
ISSN
1063-6773
ISBN
978-1-7281-5620-0
Citations
0
PageRank
0.34
References
28
Authors
3
Name | Order | Citations | PageRank
Markus Schnappinger | 1 | 3 | 2.41
Arnaud Fietzke | 2 | 3 | 1.40
Alexander Pretschner | 3 | 26 | 9.69