Title
Autonomous Orchestration of Distributed Discrete Event Simulations in the Presence of Resource Uncertainty
Abstract
Discrete event simulations model the behavior of complex, real-world systems. Simulating a wide range of events and conditions provides a more nuanced model, but also increases its computational footprint. To manage these processing requirements in a scalable manner, discrete event simulations can be distributed across multiple computing resources. Orchestrating the simulations in a distributed setting involves coping with resource uncertainty. We consider three key aspects of resource uncertainty: resource failures, heterogeneity, and slowdowns. Each of these aspects is managed autonomously, which involves making accurate predictions of future execution times and latencies while also accounting for differences in hardware capabilities and dynamic resource consumption profiles. Further complicating matters, individual tasks within the simulation are stateful and stochastic, requiring inter-task communication and synchronization to produce accurate outcomes. We deal with these challenges through intelligent state collection and migration, active resource monitoring, and empirical evaluation of resource capabilities under changing conditions. To underscore the viability of our solution, we provide benchmarks using a production discrete event simulation that can simultaneously sustain failures, manage resource heterogeneity, and handle slowdowns while being orchestrated by our framework.
Year
2015
DOI
10.1145/2746345
Venue
ACM Transactions on Autonomous and Adaptive Systems
Keywords
Algorithms,Design,Performance,Reliability,Fault tolerance,distributed discrete event simulation,checkpointing,neural networks,prediction
Field
Synchronization,Computer science,Real-time computing,Fault tolerance,Stateful firewall,Footprint,Artificial neural network,Orchestration (computing),Scalability,Distributed computing,Discrete event simulation
DocType
Journal
Volume
10
Issue
3
ISSN
1556-4665
Citations
1
PageRank
0.36
References
19
Authors
4
Name                  Order  Citations  PageRank
Zhiquan Sui           1      9          1.51
Matthew Malensek      2      93         10.44
Neil Harvey           3      19         2.86
Shrideep Pallickara   4      837        92.72