Title
The Collaborative Research Center FONDA.
Abstract
Today's scientific data analysis very often requires complex Data Analysis Workflows (DAWs) executed over distributed computational infrastructures, e.g., clusters. Much research effort is devoted to the tuning and performance optimization of specific workflows for specific clusters. However, an arguably even more important problem for accelerating research is the reduction of development, adaptation, and maintenance times of DAWs. We describe the design and setup of the Collaborative Research Center (CRC) 1404 "FONDA -- Foundations of Workflows for Large-Scale Scientific Data Analysis", in which roughly 50 researchers jointly investigate new technologies, algorithms, and models to increase the portability, adaptability, and dependability of DAWs executed over distributed infrastructures. We describe the motivation behind our project, explain its underlying core concepts, introduce FONDA's internal structure, and sketch our vision for the future of workflow-based scientific data analysis. We also describe some lessons learned during the "making of" a CRC in Computer Science with strong interdisciplinary components, with the aim of fostering similar endeavors.
Year
2021
DOI
10.1007/s13222-021-00397-5
Venue
Datenbank-Spektrum
Keywords
Big data processing, Data science, Distributed systems, Research software engineering, Scientific workflows
DocType
Journal
Volume
21
Issue
3
ISSN
1610-1995
Citations
0
PageRank
0.34
References
0
Authors
21