Title
Dynamic Resource Management in a Cluster for High-Availability (Research Note)
Abstract
To execute high-performance applications on a cluster, it is highly desirable to provide distributed services that globally manage the physical resources distributed over the cluster nodes. However, because a distributed service may use resources located on different nodes, it becomes sensitive to changes in the cluster configuration caused by node addition, reboot, or failure. In this paper, we propose a generic service that performs dynamic resource management in a cluster in order to provide distributed services with high availability. This service has been implemented in the Gobelins cluster operating system. The dynamic resource management service we propose makes node addition and reboot nearly transparent to all distributed services of Gobelins and, as a consequence, fully transparent to applications. In the event of a node failure, applications using resources located on the failed node need to be restarted from a previously saved checkpoint, but the availability of the cluster operating system is guaranteed, provided that its distributed services implement reconfiguration features.
Year
2002
Venue
Euro-Par
Keywords
node addition, node failure, cluster node, different node, gobelins cluster operating system, dynamic resource management, generic service, research note, failed node, cluster operating system, cluster configuration, dynamic resource management service, high availability
Field
Resource management, Reboot, Computer science, Computer network, High availability, Control reconfiguration, Distributed computing, Distributed services
DocType
Conference
ISBN
3-540-44049-6
Citations
1
PageRank
0.36
References
7
Authors
3
Name              Order  Citations  PageRank
Pascal Gallard    1      95         8.70
Christine Morin   2      435        34.65
R. Lottiaux       3      133        11.98