Title
Maintaining quality of service with dynamic fault tolerance in fat-trees
Abstract
A very important ingredient in the computing landscape is Utility Computing Data Centres (UCDCs), large-scale computing systems that offer computational services to concurrently running jobs through virtual servers. As UCDC systems increase in size and the mean time between failure decreases, it is becoming an increasingly important challenge to expediently tolerate failures (dynamically), while distributing the effects of the failure amongst the virtual servers according to their service level agreements. We propose and evaluate a strategy for offering predictable service in fat-trees experiencing faults, by reprioritising packets. The strategy is able to distribute the effect of network faults in order to satisfy a number of quality-of-service demands. Which demands to favour depends on the computer system and the characteristics of the jobs it is running, and in the presence of a moderate number of faults it is to some degree possible to meet the demands.
Year
2008
DOI
10.1007/978-3-540-89894-8_40
Venue
HiPC
Keywords
computing landscape is utility computing, ucdc systems increase, moderate number, jobs through virtual server, failure decrease, computational service, virtual server, dynamic fault tolerance, data centres, faults it is, large-scale computing systems that offer, maintaining quality, wireless ad hoc network, energy efficiency, satisfiability, mean time between failure, multicast, fault tolerant, directional antenna, utility computing, approximation algorithm, quality of service
Field
Mean time between failures, Computer science, Efficient energy use, Service-level agreement, Parallel computing, Computer network, Quality of service, Fault tolerance, Multicast, Wireless ad hoc network, Virtual channel, Distributed computing
DocType
Conference
Volume
5374
ISSN
0302-9743
Citations
0
PageRank
0.34
References
13
Authors
2
Name, Order, Citations, PageRank
Frank Olaf Sem-Jacobsen, 1, 66, 7.64
Tor Skeie, 2, 1103, 74.67