Title
Benchmarking effort estimation models using archetypal analysis
Abstract
Research on software cost estimation has resulted not only in a large number of prediction methodologies and improvement techniques, but also in numerous methods for evaluating and comparing them. Identifying the best prediction model for a specific dataset is still an open issue, since the evaluation of candidate models is essentially a multi-criteria problem. Model comparison usually involves statistical hypothesis tests with respect to a single criterion, while for multiple criteria, aggregating methods are usually employed. In the current study, we investigate the alternative approach of benchmarking, which is different from model comparison. The general idea is first to choose among the competitors a few "reference models" with special, preferably divergent, performance characteristics with respect to multiple criteria, and then to examine the placement of all the other models in relation to the reference ones. To solve this problem, we utilize a multivariate statistical method known as Archetypal Analysis (AA), which provides an appealing and intuitive approach for identifying the reference models and subsequently benchmarking all the competitors. The competitor models are considered as points in a multi-dimensional space defined by the prediction performance criteria, while AA locates the archetypes, i.e., the reference models that determine the convex hull of the swarm of all points (competitors). Apart from identifying reference models for benchmarking with superior or inferior predictive power according to several accuracy measures, the proposed methodology uses the similarity of a subset of models to a "superior" archetype to provide a mechanism for building ensembles. The proposed methodology is applied to a dataset containing performance measures of seventy-five models that were initially trained and tested on 195 Web projects of the TUKUTUKU database. The application illustrates how straightforward and intuitively interpretable the derived results are.
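The placement of competitors relative to reference models can be illustrated with a small sketch. It does not fit the archetypes themselves and is not the authors' implementation: it assumes three archetypes are already given, expresses each model as a convex combination of them (least squares with weights constrained to the simplex, solved here by projected gradient descent), and then flags models whose weight on the "superior" archetype exceeds a cutoff as ensemble candidates. All names (`archetypes`, `models`, `convex_weights`, `THRESHOLD`), the two error criteria, the numbers, and the cutoff rule are hypothetical assumptions made for illustration.

```python
# Hypothetical reference models (archetypes) in a two-criterion error space.
# All names and numbers are illustrative assumptions, not data from the study.
archetypes = {
    "superior": (0.1, 0.1),  # low error on both criteria
    "inferior": (0.9, 0.8),  # high error on both criteria
    "mixed": (0.2, 0.9),     # low error on the first criterion, high on the second
}

# Hypothetical competitor models described by the same two error criteria.
models = {
    "model_a": (0.1, 0.1),
    "model_b": (0.5, 0.45),
    "model_c": (0.2, 0.9),
    "model_d": (0.3, 0.3),
}


def project_to_simplex(values):
    """Euclidean projection onto {w : w >= 0, sum(w) = 1} (sort-based method)."""
    ordered = sorted(values, reverse=True)
    running_sum = 0.0
    shift = 0.0
    for count, value in enumerate(ordered, start=1):
        running_sum += value
        candidate = (running_sum - 1.0) / count
        if value - candidate > 0:
            shift = candidate  # keep the shift of the last index that qualifies
    return [max(v - shift, 0.0) for v in values]


def convex_weights(point, reference_points, iterations=2000):
    """Least-squares convex weights of `point` over `reference_points`,
    computed by projected gradient descent on the simplex."""
    k = len(reference_points)
    dims = range(len(point))
    weights = [1.0 / k] * k
    # 2 * (sum of squared coordinates) bounds the gradient's Lipschitz constant,
    # so its reciprocal is a conservative step size.
    step = 1.0 / (2.0 * sum(c * c for ref in reference_points for c in ref))
    for _ in range(iterations):
        approx = [sum(w * ref[d] for w, ref in zip(weights, reference_points))
                  for d in dims]
        residual = [a - p for a, p in zip(approx, point)]
        gradient = [2.0 * sum(r * c for r, c in zip(residual, ref))
                    for ref in reference_points]
        weights = project_to_simplex(
            [w - step * g for w, g in zip(weights, gradient)])
    return weights


names = list(archetypes)
reference_points = [archetypes[name] for name in names]

# Weight of each archetype in each model's convex representation.
weights_by_model = {
    model: dict(zip(names, convex_weights(point, reference_points)))
    for model, point in models.items()
}

# Assumed rule: a model is an ensemble candidate if it is mostly "superior".
THRESHOLD = 0.6
ensemble_candidates = sorted(
    model for model, w in weights_by_model.items() if w["superior"] >= THRESHOLD
)

for model, w in weights_by_model.items():
    print(model, {name: round(value, 3) for name, value in w.items()})
print("ensemble candidates:", ensemble_candidates)
```

A model that coincides with an archetype receives a weight close to 1 on it, while a model lying between two archetypes splits its weight between them. In a full archetypal analysis the archetypes would themselves be estimated from the data as convex combinations of the observed models, rather than supplied by hand as they are here.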
Year
2014
DOI
10.1145/2639490.2639502
Venue
Promise
Keywords
algorithms, experimentation, cost estimation, benchmarking, effort estimation, measurement, archetypal analysis, management, prediction models
Field
Data mining, Reference model, Computer science, Convex hull, Cost estimate, Software, Artificial intelligence, Predictive modelling, Machine learning, Benchmarking, Statistical hypothesis testing, Competitor analysis
DocType
Conference
Citations
1
PageRank
0.35
References
14
Authors
3
Name              Order  Citations  PageRank
Nikolaos Mittas   1      238        15.03
Vagia Karpenisi   2      1          0.35
Lefteris Angelis  3      1296       82.51