Abstract
---
We study Bayesian hypernetworks: a framework for approximate Bayesian inference in neural networks. A Bayesian hypernetwork $h$ is a neural network which learns to transform a simple noise distribution, $p(\vec\epsilon) = \mathcal{N}(\vec 0, \mathbf{I})$, to a distribution $q(\theta) := q(h(\vec\epsilon))$ over the parameters $\theta$ of another neural network (the "primary network"). We train $q$ with variational inference, using an invertible $h$ to enable efficient estimation of the variational lower bound on the posterior $p(\theta \mid \mathcal{D})$ via sampling. In contrast to most methods for Bayesian deep learning, Bayesian hypernets can represent a complex multimodal approximate posterior with correlations between parameters, while enabling cheap i.i.d. sampling of $q(\theta)$. In practice, Bayesian hypernets can provide a better defense against adversarial examples than dropout, and also exhibit competitive performance on a suite of tasks that evaluate model uncertainty, including regularization, active learning, and anomaly detection.
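Below is a minimal PyTorch sketch of the idea in the abstract, not the authors' exact architecture: an invertible affine coupling layer plays the role of the hypernetwork $h$, the primary network is a toy logistic-regression model, and a standard-normal prior on $\theta$ is assumed. The names `AffineCoupling` and `elbo`, the single-coupling-layer flow, and all sizes are illustrative assumptions. Because $h$ is invertible with a cheap log-determinant, $\log q(\theta) = \log p(\vec\epsilon) - \log\left|\det \partial h / \partial \vec\epsilon\right|$ by the change-of-variables formula, which is what makes the sampled lower bound tractable.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """One invertible affine coupling layer: the hypernetwork h.
    Maps noise eps to primary-net parameters theta with a cheap log-det."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.d = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.d, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.d)),
        )

    def forward(self, eps):
        e1, e2 = eps[:, :self.d], eps[:, self.d:]
        s, t = self.net(e1).chunk(2, dim=1)
        s = torch.tanh(s)                       # keep scales numerically stable
        theta = torch.cat([e1, e2 * torch.exp(s) + t], dim=1)
        log_det = s.sum(dim=1)                  # log |det dh/deps|
        return theta, log_det

def elbo(hypernet, x, y, n_params, num_samples=5):
    """Monte Carlo estimate of the variational lower bound (full-data ELBO).
    log q(theta) = log p(eps) - log|det dh/deps| by change of variables."""
    eps = torch.randn(num_samples, n_params)
    theta, log_det = hypernet(eps)
    std_normal = torch.distributions.Normal(0., 1.)
    log_q = std_normal.log_prob(eps).sum(1) - log_det
    log_prior = std_normal.log_prob(theta).sum(1)   # assumed N(0, I) prior
    # Toy primary network: logistic regression with weights drawn from theta.
    w = theta[:, :-1]                           # [num_samples, in_features]
    b = theta[:, -1]                            # [num_samples]
    logits = x @ w.t() + b                      # [batch, num_samples]
    log_lik = -nn.functional.binary_cross_entropy_with_logits(
        logits, y[:, None].expand_as(logits), reduction="none").sum(0)
    return (log_lik + log_prior - log_q).mean()

# Usage: one gradient step on a toy binary-classification problem.
torch.manual_seed(0)
x = torch.randn(32, 4)
y = (x.sum(1) > 0).float()
flow = AffineCoupling(dim=5)                    # 4 weights + 1 bias
opt = torch.optim.Adam(flow.parameters(), lr=1e-2)
loss = -elbo(flow, x, y, n_params=5)
loss.backward()
opt.step()
```

The coupling layer is chosen here only because its Jacobian is triangular, so the log-determinant needed for $\log q(\theta)$ is a simple sum; any invertible flow with a tractable log-det would serve the same role.
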
Year | Venue | DocType
---|---|---|
2017 | CoRR | Journal

Volume | Citations | PageRank
---|---|---|
abs/1710.04759 | 0 | 0.34

References | Authors
---|---|
0 | 6

Name | Order | Citations | PageRank |
---|---|---|---|
David Krueger | 1 | 200 | 11.17 |
Chin-Wei Huang | 2 | 8 | 5.18 |
Riashat Islam | 3 | 162 | 8.27 |
Ryan D. Turner | 4 | 34 | 4.33
Alexandre Lacoste | 5 | 0 | 1.35 |
Aaron C. Courville | 6 | 6671 | 348.46 |