Abstract | ||
---|---|---|
With the widespread use of software cluster systems, their reliability is drawing increasing attention from both academia and industry. A cluster system is a kind of software load-sharing system (LSS) whose reliability depends significantly on system software. Therefore, traditional reliability analysis methods for hardware LSSs are not applicable to cluster systems. In this paper, we develop a reliability analysis model for redundant cluster systems consisting of initial servers and cold standby servers used to replace failed ones. The system reliability process is modeled with a state-based non-homogeneous Markov process (NHMP), where each state corresponds to a non-homogeneous Poisson process (NHPP). The NHPP arrival rate is expressed using Cox's proportional hazard model (PHM) in terms of the cumulative and instantaneous workload of system software. In addition to redundant cluster systems without repair, the model can also be extended to analyze those with restart. The analysis results can support cluster management and design decisions. Finally, the evaluation experiments show the potential of our model. |
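
To make the abstract's description of a workload-dependent NHPP rate more concrete, the following is a minimal sketch rather than the paper's actual model. It assumes a Cox-style rate of the form baseline × exp(β₁ · cumulative workload + β₂ · instantaneous workload) and uses the standard NHPP result that the probability of no failure in [0, t] is exp(−∫₀ᵗ λ(u) du). The function names, the constant baseline, and all coefficient values are hypothetical, and the sketch covers a single state only, not the full Markov process with cold standby servers.

```python
import math

def hazard(baseline, beta_cum, beta_inst, cum_load, inst_load):
    """Cox-style proportional hazard: baseline rate scaled by workload covariates.

    All parameters are illustrative assumptions, not values from the paper.
    """
    return baseline * math.exp(beta_cum * cum_load + beta_inst * inst_load)

def reliability(rate_fn, t, steps=1000):
    """Probability of no NHPP event in [0, t]: exp(-integral of rate_fn over [0, t]).

    The integral is approximated with the composite trapezoidal rule.
    """
    h = t / steps
    total = 0.5 * (rate_fn(0.0) + rate_fn(t))
    for i in range(1, steps):
        total += rate_fn(i * h)
    return math.exp(-total * h)

# Hypothetical scenario: a server under constant instantaneous workload w,
# so its cumulative workload at time u is w * u.
def server_rate(w, baseline=0.01, beta_cum=0.05, beta_inst=0.2):
    return lambda u: hazard(baseline, beta_cum, beta_inst, w * u, w)

light = reliability(server_rate(1.0), 10.0)
heavy = reliability(server_rate(2.0), 10.0)
print(f"reliability at t=10, light load: {light:.4f}")
print(f"reliability at t=10, heavy load: {heavy:.4f}")
```

Under these assumptions, a heavier workload raises the failure rate both immediately (through the instantaneous term) and progressively (through the cumulative term), so the heavier-loaded server has lower reliability over the same interval.
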
Year | DOI | Venue |
---|---|---|
2016 | 10.1109/COMPSAC.2016.177 | 2016 IEEE 40th Annual Computer Software and Applications Conference (COMPSAC) |
Keywords | Field | DocType |
cluster system,load-sharing system,cumulative workload,software reliability,software aging | System software,Markov process,Computer science,Server,Real-time computing,Software,Software reliability testing,Software aging,Software quality,Software sizing,Reliability engineering | Conference |
Volume | ISSN | ISBN |
1 | 0730-3157 | 978-1-4673-8846-7 |
Citations | PageRank | References |
0 | 0.34 | 16 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Chunyan Hou | 1 | 0 | 0.68 |
Chen Chen | 2 | 440 | 57.36 |
Jinsong Wang | 3 | 8 | 3.15 |
Kai Shi | 4 | 8 | 4.99 |