Paper Info

Title
Residual Networks are Exponential Ensembles of Relatively Shallow Networks.

Abstract
In this work, we introduce a novel interpretation of residual networks showing they are exponential ensembles. This observation is supported by a large-scale lesion study that demonstrates they behave just like ensembles at test time. Subsequently, we perform an analysis showing these ensembles mostly consist of networks that are each relatively shallow. For example, contrary to our expectations, most of the gradient in a residual network with 110 layers comes from an ensemble of very short networks, i.e., only 10-34 layers deep. This suggests that in addition to describing neural networks in terms of width and depth, there is a third dimension: multiplicity, the size of the implicit ensemble. Ultimately, residual networks do not resolve the vanishing gradient problem by preserving gradient flow throughout the entire depth of the network - rather, they avoid the problem simply by ensembling many short networks together. This insight reveals that depth is still an open research question and invites the exploration of the related notion of multiplicity.
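One way to make the "exponential ensemble" reading concrete is to count paths. The sketch below rests on an assumption the abstract does not spell out: each residual block either applies its residual module or is bypassed by the skip connection, so a stack of n blocks contains 2^n distinct paths, and the number of paths entering exactly k modules is the binomial coefficient C(n, k). The function name `path_length_counts` and the block counts used are illustrative only. Lengths here are counted in residual modules, not layers, so they do not map directly onto the "10-34 layers" figure quoted above.

```python
from math import comb


def path_length_counts(num_blocks):
    """Count the paths of each length through a stack of residual blocks.

    Assumption (hypothetical model): every block is either entered or
    skipped, so a path that enters exactly k residual modules can be
    chosen in C(num_blocks, k) ways.
    """
    return [comb(num_blocks, k) for k in range(num_blocks + 1)]


def fraction_of_paths(num_blocks, shortest, longest):
    """Share of all paths whose length lies in [shortest, longest]."""
    counts = path_length_counts(num_blocks)
    return sum(counts[shortest:longest + 1]) / sum(counts)


if __name__ == "__main__":
    counts = path_length_counts(3)
    print(counts)       # [1, 3, 3, 1]
    print(sum(counts))  # 8, i.e. 2**3 paths in total

    # For a deeper, hypothetical stack most paths cluster around half depth.
    print(fraction_of_paths(54, 20, 34))
```

Under this model the total path count doubles with every added block, while the typical path length grows only as half the number of blocks, which is consistent with the abstract's claim that the implicit ensemble is made up mostly of relatively shallow networks.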

Year	2016
Venue	arXiv: Computer Vision and Pattern Recognition
Field	Residual, Exponential function, Computer science, Multiplicity (mathematics), Artificial intelligence, Artificial neural network, Balanced flow, Vanishing gradient problem, Machine learning
DocType	Journal
Volume	abs/1605.06431
Citations	13
PageRank	0.75
References	3
Authors	3

Authors (3 rows)

Name	Order	Citations	PageRank
Andreas Veit	1	50	4.85
Michael J. Wilber	2	86	7.37
Serge J. Belongie	3	12512	1010.13
