Title
Monadic composition for deterministic, parallel batch processing
Abstract
Achieving determinism on real software systems remains difficult. Even a batch-processing job, whose task is to map input bits to output bits, risks nondeterminism from thread scheduling, system calls, CPU instructions, and leakage of environmental information such as date or CPU model. In this work, we present a system for achieving low-overhead deterministic execution of batch-processing programs that read and write the file system—turning them into pure functions on files. We allow multi-process executions where a permissions system prevents races on the file system. Process separation enables different processes to enforce permissions and enforce determinism using distinct mechanisms. Our prototype, DetFlow, allows a statically-typed coordinator process to use shared-memory parallelism, as well as invoking process-trees of sandboxed legacy binaries. DetFlow currently implements the coordinator as a Haskell program with a restricted I/O type for its main function: a new monad we call DetIO. Legacy binaries launched by the coordinator run concurrently, but internally each process schedules threads sequentially, allowing dynamic determinism-enforcement with predictably low overhead. We evaluate DetFlow by applying it to bioinformatics data pipelines and software build systems. DetFlow enables determinizing these data-processing workflows by porting a small amount of code to become a statically-typed coordinator. This hybrid approach of static and dynamic determinism enforcement permits freedom where possible but restrictions where necessary.
Year
DOI
Venue
2017
10.1145/3133897
Proceedings of the ACM on Programming Languages
Keywords
Field
DocType
Haskell,concurrency,deterministic parallelism
File system,Software build,Concurrency,Computer science,Thread (computing),Software system,Schedule,Haskell,Operating system,Monad (functional programming),Distributed computing
Journal
Volume
Issue
ISSN
1
OOPSLA
2475-1421
Citations 
PageRank 
References 
0
0.34
29
Authors
4
Name
Order
Citations
PageRank
Ryan G. Scott192.30
Omar S. Navarro Leija221.39
Joseph Devietti392.79
Ryan Newton480270.80