Abstract | ||
---|---|---|
The emergence of Big Data applications provides new challenges in data management such as processing and movement of masses of data. Volunteer computing has proven itself as a distributed paradigm that can fully support Big Data generation. This paradigm uses a large number of heterogeneous and unreliable Internet-connected hosts to provide Peta-scale computing power for scientific projects. With the increase in data size and number of devices that can potentially join a volunteer computing project, the host bandwidth can become a main hindrance to the analysis of the data generated by these projects, especially if the analysis is a concurrent approach based on either in-situ or in-transit processing. In this paper, we propose a bandwidth model for volunteer computing projects based on the real trace data taken from the Docking@Home project with more than 280,000 hosts over a 5-year period. We validate the proposed statistical model using model-based and simulation-based techniques. Our modeling provides us with valuable insights on the concurrent integration of data generation with in-situ and in-transit analysis in the volunteer computing paradigm. |
Year | DOI | Venue |
---|---|---|
2014 | 10.1109/PDCAT.2014.12 | 2014 15th International Conference on Parallel and Distributed Computing, Applications and Technologies |
Keywords | Field | DocType |
Volunteer Computing, Big Data, Internet Bandwidth, Statistical Modeling | Data science, Programming with Big Data in R, Computer science, Bandwidth (signal processing), Statistical model, Data management, Big data, Volunteer computing, Test data generation, Distributed computing | Conference |
ISSN | Citations | PageRank |
2379-5352 | 1 | 0.39 |
References | Authors | |
16 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Bahman Javadi | 1 | 666 | 40.59 |
Boyu Zhang | 2 | 71 | 17.54 |
Michela Taufer | 3 | 352 | 53.04 |