Applying classification techniques to remotely-collected program execution data - Citegraph

Paper Info

Title
Applying classification techniques to remotely-collected program execution data

Abstract
There is an increasing interest in techniques that support measurement and analysis of fielded software systems. One of the main goals of these techniques is to better understand how software actually behaves in the field. In particular, many of these techniques require a way to distinguish, in the field, failing from passing executions. So far, researchers and practitioners have only partially addressed this problem: they have simply assumed that program failure status is either obvious (i.e., the program crashes) or provided by an external source (e.g., the users). In this paper, we propose a technique for automatically classifying execution data, collected in the field, as coming from either passing or failing program runs. (Failing program runs are executions that terminate with a failure, such as a wrong outcome.) We use statistical learning algorithms to build the classification models. Our approach builds the models by analyzing executions performed in a controlled environment (e.g., test cases run in-house) and then uses the models to predict whether execution data produced by a fielded instance were generated by a passing or failing program execution. We also present results from an initial feasibility study, based on multiple versions of a software subject, in which we investigate several issues vital to the applicability of the technique. Finally, we present some lessons learned regarding the interplay between the reliability of classification models and the amount and type of data collected.
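The train-in-house, predict-in-the-field workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract says only that statistical learning algorithms are used, so a simple nearest-centroid classifier is swapped in here as a stand-in. The feature vectors (execution counts for three made-up program entities), the labels, and the function names `train_centroids` and `classify` are all hypothetical.

```python
from statistics import mean


def train_centroids(runs, labels):
    """Compute one mean feature vector (centroid) per label."""
    grouped = {}
    for features, label in zip(runs, labels):
        grouped.setdefault(label, []).append(features)
    return {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def classify(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Hypothetical in-house training data: each row holds execution counts for
# three made-up program entities; labels would come from in-house test oracles.
in_house_runs = [
    [10, 0, 3],
    [12, 1, 4],
    [11, 0, 2],
    [2, 9, 0],
    [1, 8, 1],
]
in_house_labels = ["pass", "pass", "pass", "fail", "fail"]

model = train_centroids(in_house_runs, in_house_labels)

# Hypothetical execution data reported by fielded instances (no label known).
field_runs = [[11, 1, 3], [0, 10, 1]]
predictions = [classify(model, run) for run in field_runs]
print(predictions)
```

In this sketch the model is built only from labeled in-house runs and is then applied to unlabeled field data, which mirrors the two phases the abstract describes. A real study would need far more runs and a stronger learner.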

Year	2005
DOI	10.1145/1081706.1081732
Venue	Proceedings of the 9th European software engineering conference held jointly with 11th ACM SIGSOFT international symposium on Foundations of software engineering
Keywords	software systems, feasibility study, data collection, reliability, machine learning, classification
Field	Software behavior, Software engineering, Computer science, Software system, Real-time computing, Software, Test case, Statistical learning
DocType	Conference
Volume	30
Issue	5
ISSN	0163-5948
ISBN	1-59593-014-0
Citations	33
PageRank	1.39
References	13
Authors	5

Authors (5 rows)

Name	Order	Citations	PageRank
Murali Haran	1	36	2.85
Alan F. Karr	2	1005	76.93
Alessandro Orso	3	3550	172.85
Adam Porter	4	2159	196.52
Ashish Sanil	5	152	12.81
