Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture

Libin Zhu · Chaoyue Liu · Misha Belkin

Hall J #934

Keywords: [ Directed Acyclic Graph ] [ wide neural networks ] [ Neural Tangent Kernel ] [ over-parameterization ] [ transition to linearity ]


In this paper we show that feedforward neural networks corresponding to arbitrary directed acyclic graphs undergo a transition to linearity as their "width" approaches infinity. The width of these general networks is characterized by the minimum in-degree of their neurons, excluding neurons in the input and first layers. Our results identify the mathematical structure underlying the transition to linearity and generalize a number of recent works aimed at characterizing the transition to linearity, or the constancy of the Neural Tangent Kernel, for standard architectures.
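The transition to linearity the abstract describes can be illustrated numerically: for a wide network in NTK parameterization, the gap between the network and its first-order Taylor expansion around the initial parameters shrinks as the width grows. Below is a minimal sketch (not the paper's construction, which covers general DAG architectures) using a one-hidden-layer tanh network, the simplest special case; all function and variable names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def linearization_error(m, eps=1.0, d=5, n_inputs=10):
    """Max deviation of f from its first-order Taylor expansion
    after a parameter perturbation of fixed Euclidean norm eps."""
    # NTK parameterization: f(x) = (1/sqrt(m)) * v . tanh(W @ x)
    W = rng.standard_normal((m, d))
    v = rng.standard_normal(m)
    X = rng.standard_normal((n_inputs, d))

    def f(W, v):
        return (np.tanh(X @ W.T) @ v) / np.sqrt(m)

    # Random perturbation of fixed norm eps in the full parameter space.
    dW = rng.standard_normal((m, d))
    dv = rng.standard_normal(m)
    scale = eps / np.sqrt((dW**2).sum() + (dv**2).sum())
    dW, dv = dW * scale, dv * scale

    # Exact directional derivative (Jacobian-vector product), by hand:
    # df = (1/sqrt(m)) * [ ((1 - tanh^2(X W^T)) * (X dW^T)) v + tanh(X W^T) dv ]
    H = np.tanh(X @ W.T)
    jvp = (((1 - H**2) * (X @ dW.T)) @ v + H @ dv) / np.sqrt(m)

    # Linearization error: |f(theta + delta) - f(theta) - <grad f, delta>|
    return np.abs(f(W + dW, v + dv) - (f(W, v) + jvp)).max()

errors = {m: linearization_error(m) for m in (10, 100, 1000, 10000)}
```

Under this parameterization the Hessian of the network output has spectral norm of order 1/sqrt(m), so the linearization error for a fixed-norm perturbation decays at that rate, which is the quantitative content of "transition to linearity" in this simple case.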
