Abstract | ||
---|---|---|
Error data collected at runtime play a key role for dependability analysis and improvement of software systems. The use of monitoring frameworks for legacy mission-critical systems is hindered by limited intervention degree and low intrusiveness requirements. We present the design and experimentation of an error monitoring service for a legacy large-scale critical system in the Air Traffic Control (ATC) domain. We describe the details of the API realized to collect both direct data (event logs, execution traces) and indirect data (system resources' utilization). We present experiments with the ATC industrial case study, showing the efficacy of combining different data sources for error detection and propagation analysis, with an acceptable overhead at high monitoring rates for such a class of systems. |
Year | DOI | Venue |
---|---|---|
2016 | 10.1109/DSN-W.2016.41 | 2016 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshop (DSN-W) |
Keywords | Field | DocType |
Error monitoring,event logging,critical systems | Kernel (linear algebra),Data collection,Computer science,Mission critical systems,Air traffic control,Critical system,Intrusiveness,Software system,Error detection and correction,Real-time computing,Reliability engineering,Distributed computing | Conference |
ISSN | ISBN | Citations |
2325-6648 | 978-1-5090-3688-2 | 0 |
PageRank | References | Authors |
0.34 | 9 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Marcello Cinque | 1 | 286 | 33.58 |
Raffaele Della Corte | 2 | 28 | 6.30 |
Stefano Russo | 3 | 728 | 78.07 |