Title
Convex Formulation of Overparameterized Deep Neural Networks
Abstract
The analysis of overparameterized neural networks has drawn significant attention in recent years. It was shown that such systems behave like convex systems under various restricted settings, such as for two-layer neural networks, or when learning is restricted locally to the so-called neural tangent kernel space around specialized initializations. However, there is a lack of powerful theoretical techniques for analyzing fully trained deep neural networks under general conditions. This paper addresses this fundamental problem by investigating overparameterized deep neural networks when fully trained. Specifically, we characterize a deep neural network by the distributions of its features and propose a metric that intuitively measures the usefulness of feature representations. Under certain regularizers that bound this metric, we show that deep neural networks can be reformulated as a convex optimization problem, and that the resulting system guarantees effective feature representations in terms of the metric. Our new analysis is more consistent with the empirical observation that deep neural networks are capable of learning efficient feature representations. Empirical studies confirm that the predictions of our theory are consistent with results observed in practice.
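As a concrete illustration of the kind of feature-usefulness metric the abstract alludes to, the following minimal Python sketch scores a layer's features by a between-class/within-class scatter ratio. This is a hypothetical stand-in chosen for illustration, not the metric defined in the paper; the function name, the scatter-ratio choice, and the toy data are all assumptions.

# Hypothetical sketch (NOT the paper's metric): score how "useful" a layer's
# feature representation is via a between-class / within-class scatter ratio.
import numpy as np

def feature_usefulness(features: np.ndarray, labels: np.ndarray) -> float:
    """Ratio of between-class to within-class scatter (higher = more separable).

    features: (n_samples, n_dims) activations taken from some network layer.
    labels:   (n_samples,) integer class labels.
    """
    overall_mean = features.mean(axis=0)
    between, within = 0.0, 0.0
    for c in np.unique(labels):
        cls = features[labels == c]
        mean_c = cls.mean(axis=0)
        # Between-class scatter: class means spread away from the overall mean.
        between += len(cls) * np.sum((mean_c - overall_mean) ** 2)
        # Within-class scatter: samples spread around their own class mean.
        within += np.sum((cls - mean_c) ** 2)
    return between / max(within, 1e-12)

# Toy usage: well-separated Gaussian clusters score higher than mixed features.
rng = np.random.default_rng(0)
good = np.vstack([rng.normal(0, 1, (50, 8)), rng.normal(5, 1, (50, 8))])
bad = rng.normal(0, 1, (100, 8))
y = np.repeat([0, 1], 50)
print(feature_usefulness(good, y), feature_usefulness(bad, y))

Under these assumptions, the "good" features (two separated clusters) receive a much higher score than the "bad" features (a single undifferentiated cloud), mirroring the intuition that a useful representation makes classes easy to distinguish.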
Year
2022
DOI
10.1109/TIT.2022.3163341
Venue
IEEE Transactions on Information Theory
Keywords
Deep learning, convex reformulation, feature representation
DocType
Journal
Volume
68
Issue
8
ISSN
0018-9448
Citations
0
PageRank
0.34
References
9
Authors
4
Name             Order  Citations  PageRank
Cong Fang        1      17         7.14
Yihong Gu        2      7          1.81
Weizhong Zhang   3      31         9.58
Zhang, Tong      4      71266      11.43