Bandwidth Modeling in Large Distributed Systems for Big Data Applications - Citegraph

Paper Info

Title
Bandwidth Modeling in Large Distributed Systems for Big Data Applications

Abstract
The emergence of Big Data applications poses new challenges in data management, such as processing and moving massive volumes of data. Volunteer computing has proven itself as a distributed paradigm that can fully support Big Data generation. This paradigm uses a large number of heterogeneous and unreliable Internet-connected hosts to provide petascale computing power for scientific projects. As data sizes and the number of devices that can potentially join a volunteer computing project grow, host bandwidth can become a major bottleneck for analyzing the data these projects generate, especially when the analysis is performed concurrently using either in-situ or in-transit processing. In this paper, we propose a bandwidth model for volunteer computing projects based on real trace data taken from the Docking@Home project, covering more than 280,000 hosts over a 5-year period. We validate the proposed statistical model using model-based and simulation-based techniques. Our modeling provides valuable insights into the concurrent integration of data generation with in-situ and in-transit analysis in the volunteer computing paradigm.

Year	2014
DOI	10.1109/PDCAT.2014.12
Venue	2014 15th International Conference on Parallel and Distributed Computing, Applications and Technologies
Keywords	Volunteer Computing, Big Data, Internet Bandwidth, Statistical Modeling
Field	Data science, Programming with Big Data in R, Computer science, Bandwidth (signal processing), Statistical model, Data management, Big data, Volunteer computing, Test data generation, Distributed computing
DocType	Conference
ISSN	2379-5352
Citations	1
PageRank	0.39
References	16
Authors	3

Authors (3 rows)

Name	Order	Citations	PageRank
Bahman Javadi	1	666	40.59
Boyu Zhang	2	71	17.54
Michela Taufer	3	352	53.04
