Abstract
---
We describe a simple, low-level approach for embedding probabilistic programming in a deep learning ecosystem. In particular, we distill probabilistic programming down to a single abstraction: the random variable. Our lightweight implementation in TensorFlow enables numerous applications: a model-parallel variational auto-encoder (VAE) with 2nd-generation tensor processing units (TPUv2s); a data-parallel autoregressive model (Image Transformer) with TPUv2s; and a multi-GPU No-U-Turn Sampler (NUTS). For both a state-of-the-art VAE on 64x64 ImageNet and the Image Transformer on 256x256 CelebA-HQ, our approach achieves an optimal linear speedup from 1 to 256 TPUv2 chips. With NUTS, we see a 100x speedup on GPUs over Stan and a 37x speedup over PyMC3.
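As a rough illustration of the single abstraction the abstract describes, the sketch below writes a small probabilistic program in the style of Edward2, the TensorFlow library released with this work. It is a minimal sketch under assumptions: the model, the function name `probabilistic_matrix_factorization`, and all shapes and scales are illustrative choices, not code from the paper, and module paths varied across Edward2 releases.

```python
# A minimal sketch of the paper's core abstraction, the random variable,
# in the style of the Edward2 API. The model and all numeric choices here
# are illustrative assumptions, not taken from the paper.
import edward2 as ed
import tensorflow as tf

def probabilistic_matrix_factorization(num_users, num_items, dim):
  # Each ed.* call constructs a random variable: an object backed by a
  # sample that behaves like a tf.Tensor, so it composes with ordinary
  # TensorFlow ops.
  user_embed = ed.Normal(loc=tf.zeros([num_users, dim]), scale=1.,
                         name="user_embed")
  item_embed = ed.Normal(loc=tf.zeros([num_items, dim]), scale=1.,
                         name="item_embed")
  # Downstream random variables can consume upstream ones directly.
  ratings = ed.Normal(
      loc=tf.matmul(user_embed, item_embed, transpose_b=True),
      scale=0.1, name="ratings")
  return ratings

# Drawing a forward sample just runs the program.
sample = probabilistic_matrix_factorization(num_users=50, num_items=60, dim=5)
```

Because a random variable is just a Tensor-like value, the same program can be placed on GPUs or TPUs and sharded like any other TensorFlow computation, which is what enables the distributed experiments the abstract reports.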
Year | Venue | Keywords
---|---|---
2018 | Advances in Neural Information Processing Systems 31 (NIPS 2018) | random variable, probabilistic programming, linear speedup
DocType | Volume | ISSN
---|---|---
Conference | 31 | 1049-5258
Citations | PageRank | References
---|---|---
1 | 0.35 | 27
Authors (8)
---
Name | Order | Citations | PageRank |
---|---|---|---
Dustin Tran | 1 | 201 | 17.24
Matt Hoffman | 2 | 227 | 14.27 |
Dave Moore | 3 | 1 | 2.38 |
Christopher Suter | 4 | 1 | 0.35 |
Srinivas Vasudevan | 5 | 1 | 0.69 |
Alexey Radul | 6 | 35 | 8.90 |
Matthew James Johnson | 7 | 4 | 5.67 |
Rif Saurous | 8 | 148 | 10.49 |