Title | ||
---|---|---|
Decentralized Load Balancing for Improving Reliability in Heterogeneous Distributed Systems |
Abstract | ||
---|---|---|
A probabilistic analytical framework for decentralized load balancing (LB) strategies for heterogeneous distributed-computing systems (DCSs) is presented with the overall goal of maximizing the service reliability in the presence of random failures. The service reliability of a DCS is defined as the probability of successfully serving a specified workload before all the computing nodes fail permanently. In the framework considered, the service and failure times of nodes are random, the communication times in the network are both tangible and stochastic, and LB is performed synchronously by all the nodes during the runtime of each submitted workload. By taking a novel regenerative stochastic-analysis approach, the service reliability of a two-node DCS is characterized analytically. This formulation, in turn, is used to form and solve an optimization problem, yielding LB policies with maximal reliability. A scalable extension of the two-node formulation to an arbitrary-size system is also presented. The validity of the proposed theory is studied using both Monte Carlo simulations and real experiments on a small-scale testbed. |
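
The definition of service reliability in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' regenerative analysis or their simulator; it is a simplified toy model under assumptions that are not stated in the record: exponentially distributed service, failure, and transfer-delay times, a single one-shot transfer from node 1 to node 2 at time zero, and no recovery of tasks held by a failed node. The function names and parameters (`estimate_service_reliability`, `best_transfer`, `mu`, `lam`, `delay_rate`) are hypothetical.

```python
import random


def estimate_service_reliability(m1, m2, transfer, mu, lam, delay_rate,
                                 trials=20000, seed=0):
    """Estimate P(all tasks are served before the node holding them fails).

    Assumed toy model (not taken from the paper):
      m1, m2      initial task counts at node 1 and node 2
      transfer    tasks node 1 sends to node 2 at time 0
      mu          (mu1, mu2) exponential service rates per task
      lam         (lam1, lam2) exponential permanent-failure rates
      delay_rate  exponential rate of the random transfer delay
    """
    rng = random.Random(seed)

    def busy_time(n, rate):
        # Time for one node to serve n tasks back to back.
        return sum(rng.expovariate(rate) for _ in range(n))

    successes = 0
    for _ in range(trials):
        fail1 = rng.expovariate(lam[0])
        fail2 = rng.expovariate(lam[1])
        # Node 1 serves only the tasks it keeps.
        done1 = busy_time(m1 - transfer, mu[0])
        # Node 2 serves its own tasks, waits for the transfer if needed,
        # then serves the transferred tasks.
        arrival = rng.expovariate(delay_rate) if transfer > 0 else 0.0
        done2 = max(busy_time(m2, mu[1]), arrival) + busy_time(transfer, mu[1])
        if done1 < fail1 and done2 < fail2:
            successes += 1
    return successes / trials


def best_transfer(m1, m2, mu, lam, delay_rate, trials=20000, seed=0):
    """Brute-force search for the transfer size with the highest estimate."""
    return max(
        range(m1 + 1),
        key=lambda size: estimate_service_reliability(
            m1, m2, size, mu, lam, delay_rate, trials, seed),
    )
```
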
Year | DOI | Venue |
---|---|---|
2009 | 10.1109/ICPPW.2009.50 | ICPP Workshops |
Keywords | Field | DocType |
arbitrary size system,probabilistic analytical framework,monte-carlo simulation,maximal reliability,lb policy,specified workload,two-node dcs,random failure,improving reliability,service reliability,queuing theory,two-node formulation,decentralized load,monte carlo methods,mathematical model,queueing theory,probability,reliability theory,probability density function,optimization problem,distributed computing,distributed processing,load balancing,monte carlo simulation,communication networks,stochastic analysis,software reliability,load balance,monte carlo simulations,stochastic processes,resource allocation,renewal theory,reliability | Load management,Telecommunications network,Load balancing (computing),Computer science,Parallel computing,Queueing theory,Probabilistic logic,Optimization problem,Reliability theory,Distributed computing,Scalability | Conference |
ISSN | Citations | PageRank |
1530-2016 | 4 | 0.42 |
References | Authors | |
9 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jorge E. Pezoa | 1 | 119 | 15.76 |
Sagar Dhakal | 2 | 72 | 4.20 |
Majeed M. Hayat | 3 | 213 | 26.36 |