Title
Low-cost fault-tolerance protocol for large-scale network monitoring
Abstract
A distributed hierarchical network monitoring model has been proposed to solve the scalability problem of the centralized model. In this distributed model, a top-level monitoring manager, called the main manager, obtains aggregate management information from mid-level managers, called domain managers, forming a hierarchical structure. However, if some of the monitoring managers crash, network elements cannot be monitored continuously and correctly until those managers are repaired. To address this important but previously unresolved issue, this paper presents a new fault-tolerance protocol for domain managers, named DMFTP, which allows the managers to efficiently exploit their organizational structure. As a result, the protocol can minimize failure detection overhead and the number of live managers affected by each manager node crash. It also tolerates concurrent manager failures and, after the failed managers have been repaired, ensures their immediate and consistent recovery.
Year: 2003
DOI: 10.1007/3-540-44863-2_50
Venue: International Conference on Computational Science
Keywords: mid-level manager, failed manager, managers crash, concurrent manager failure, centralized model, top-level monitoring manager, manager node crash, main manager, low-cost fault-tolerance protocol, domain manager, large-scale network monitoring, live manager, fault tolerant, network monitoring
Field: Crash, Management information systems, Distributed element model, Organizational structure, Computer science, Fault tolerance, Network element, Network monitoring, Distributed computing, Scalability
DocType: Conference
Volume: 2659
ISSN: 0302-9743
ISBN: 3-540-40196-2
Citations: 0
PageRank: 0.34
References: 3
Authors: 4
Name            Order   Citations   PageRank
Jinho Ahn       1       83          27.05
Sung-Gi Min     2       115         24.64
Youngil Choi    3       44          6.98
Byung Sun Lee   4       71          13.15