Title
Characterization of operational failures from a business data processing SaaS platform
Abstract
This paper characterizes operational failures of a production Custom Package Good Software-as-a-Service (SaaS) platform. Event logs collected over 283 days of in-field operation are used to characterize platform failures. The characterization estimates (i) the common failure types of the platform, (ii) the key factors impacting platform failures, (iii) the failure rate, and (iv) how user workload (files submitted for processing) affects the failure rate. The major findings are: (i) 34.1% of failures are caused by unexpected values in customers' data, (ii) nearly 33% of failures are due to timeouts, and (iii) the failure rate increases as the workload intensity (transactions/second) increases, while there is no statistical evidence that it is influenced by the workload volume (size of users' data). Finally, the paper presents the lessons learned and describes how the findings and the implemented analysis tool allow platform developers to improve platform code, system settings, and customer management.
Year: 2014
DOI: 10.1145/2591062.2591172
Venue: ICSE Companion
Keywords: robustness, software/program verification, failure analysis, measurement, reliability, cloud computing, SaaS, logs, testing and debugging, performance
Field: Customer relationship management, Workload, Business data processing, Computer science, Failure rate, Real-time computing, Robustness (computer science), Software as a service, Timeout, Operating system, Reliability engineering, Cloud computing
DocType: Conference
Citations: 9
PageRank: 0.53
References: 17
Authors: 6
Name                    Order  Citations  PageRank
Catello Di Martino      1      219        14.78
Zbigniew Kalbarczyk     2      1896       159.48
Ravishankar K. Iyer     3      3489       504.32
Geetika Goel            4      15         1.60
Santonu Sarkar          5      319        33.27
Rajeshwari Ganesan      6      74         7.65