Title
Experimental Analysis of a Gossip-Based Service for Scalable, Distributed Failure Detection and Consensus
Abstract
Gossip protocols and services provide a means by which failures can be detected in large, distributed systems in an asynchronous manner without the limits associated with reliable multicasting for group communications. Extending the gossip protocol such that a system reaches consensus on detected faults can be performed via a flat structure, or it can be hierarchically distributed across cooperating layers of nodes. In this paper, the performance of gossip services employing flat and hierarchical schemes is analyzed on an experimental testbed in terms of consensus time, resource utilization and scalability. Performance associated with a hierarchically arranged gossip scheme is analyzed with varying group sizes and is shown to scale well. Resource utilization of the gossip-style failure detection and consensus service is measured in terms of network bandwidth utilization and CPU utilization. Analytical models are developed for resource utilization and performance projections are made for large system sizes.
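The flat scheme mentioned in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the data structures, so the per-node age list, the suspect matrix, the rule for merging both, and every name and parameter here (`Node`, `simulate`, `t_fail`, `fail_round`) are hypothetical. Each live node gossips its state to one random peer per round, suspects a peer it has heard nothing fresh about for `t_fail` rounds, and treats a failure as agreed once every peer it still trusts also suspects that node.

```python
import random


class Node:
    """One participant in a flat gossip group (illustrative assumption)."""

    def __init__(self, node_id, n, t_fail):
        self.id, self.n, self.t_fail = node_id, n, t_fail
        self.alive = True
        self.age = [0] * n  # rounds since fresh news about each peer
        self.suspect = [[False] * n for _ in range(n)]  # suspect[i][j]: i suspects j
        self.agreed_failed = set()  # nodes this node considers failed by consensus

    def tick(self):
        # Advance local time by one gossip round and refresh this node's own row.
        for j in range(self.n):
            self.age[j] = 0 if j == self.id else self.age[j] + 1
        self.suspect[self.id] = [a >= self.t_fail for a in self.age]

    def snapshot(self):
        # Copy of the state that is sent in one gossip message.
        return list(self.age), [list(row) for row in self.suspect]

    def merge(self, age, suspect):
        # Adopt the sender's view of peer k whenever it is fresher than ours.
        for k in range(self.n):
            if k != self.id and age[k] < self.age[k]:
                self.age[k] = age[k]
                self.suspect[k] = suspect[k]

    def check_consensus(self):
        # Node j is agreed failed when every peer we still trust suspects it too.
        mine = self.suspect[self.id]
        trusted = [k for k in range(self.n) if not mine[k]]
        for j in range(self.n):
            if mine[j] and all(self.suspect[k][j] for k in trusted):
                self.agreed_failed.add(j)


def simulate(n=8, t_fail=20, fail_id=3, fail_round=10, rounds=120, seed=1):
    """Return {node_id: round at which it reached consensus on fail_id}."""
    rng = random.Random(seed)
    nodes = [Node(i, n, t_fail) for i in range(n)]
    detected_at = {}
    for rnd in range(rounds):
        if rnd == fail_round:
            nodes[fail_id].alive = False  # crash failure: node goes silent
        live = [nd for nd in nodes if nd.alive]
        for nd in live:
            nd.tick()
        for nd in live:
            # Push gossip to one random peer; a crashed peer ignores the message.
            target = nodes[rng.choice([i for i in range(n) if i != nd.id])]
            if target.alive:
                target.merge(*nd.snapshot())
        for nd in live:
            nd.check_consensus()
            if fail_id in nd.agreed_failed and nd.id not in detected_at:
                detected_at[nd.id] = rnd
    return detected_at
```

Calling `simulate()` returns, for each surviving node, the round in which it agreed that node 3 had failed. The spread of those rounds gives a rough feel for the consensus time that the paper measures on a real testbed.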
Year
2003
DOI
10.1023/A:1023592621046
Venue
Cluster Computing
Keywords
cluster computing, consensus, failure detection, fault tolerance, gossip protocol, layering
Field
Asynchronous communication, CPU time, Computer science, Gossip, Computer network, Real-time computing, Fault tolerance, Gossip protocol, Multicast, Computer cluster, Scalability, Distributed computing
DocType
Journal
Volume
6
Issue
3
ISSN
1573-7543
Citations
4
PageRank
0.48
References
2
Authors (3)
Name | Order | Citations | PageRank
Krishnakanth Sistla | 1 | 4 | 0.48
Alan George | 2 | 109 | 24.60
Robert W. Todd | 3 | 34 | 3.52