Title
High-Throughput Computing on High-Performance Platforms: A Case Study
Abstract
The computing systems used by LHC experiments have historically consisted of the federation of hundreds to thousands of distributed resources, ranging from small to mid-size resources. In spite of the impressive scale of the existing distributed computing solutions, the federation of small to mid-size resources will be insufficient to meet projected future demands. This paper is a case study of how the ATLAS experiment has embraced Titan, a DOE leadership facility, in conjunction with traditional distributed high-throughput computing to reach sustained production scales of approximately 52M core-hours a year. The three main contributions of this paper are: (i) a critical evaluation of design and operational considerations to support the sustained, scalable and production usage of Titan; (ii) a preliminary characterization of a next-generation executor for PanDA to support new workloads and advanced execution modes; and (iii) early lessons for how current and future experimental and observational systems can be integrated with production supercomputers and other platforms in a general and extensible manner.
Year: 2017
DOI: 10.1109/eScience.2017.43
Venue: 2017 IEEE 13th International Conference on e-Science (e-Science)
Keywords: high-performance and throughput computing
Field: Large Hadron Collider, Executor, High-throughput computing, Computer science, Server, Computing systems, Scalability, Distributed computing
DocType: Conference
ISSN: 2325-372X
ISBN: 978-1-5386-2687-0
Citations: 3
PageRank: 0.49
References: 2
Authors: 9
Name | Order | Citations | PageRank
Danila Oleynik | 1 | 3 | 0.82
S. Panitkin | 2 | 5 | 1.90
Matteo Turilli | 3 | 84 | 16.21
Alessio Angius | 4 | 3 | 0.49
Sarp H. Oral | 5 | 3 | 1.16
Kaushik De | 6 | 171 | 23.99
A. Klimentov | 7 | 5 | 2.57
Jack C. Wells | 8 | 15 | 4.24
Shantenu Jha | 9 | 188 | 32.40