We study the multiple manifold problem, a binary classification task modeled on applications in machine vision, in which a deep fully-connected neural network is trained to separate two low-dimensional submanifolds of the unit sphere. We provide an analysis of the one-dimensional case, proving for a simple manifold configuration that when (i) the network depth L is large relative to certain geometric and statistical properties of the data, (ii) the network width n grows as a sufficiently large polynomial in L, and (iii) the number of i.i.d. samples from the manifolds is polynomial in L, randomly-initialized gradient descent rapidly learns to classify the two manifolds perfectly with high probability. Our analysis demonstrates concrete benefits of depth and width in the context of a practically-motivated model problem: the depth acts as a fitting resource, with larger depths corresponding to smoother networks that can more readily separate the class manifolds, and the width acts as a statistical resource, enabling concentration of the randomly-initialized network and its gradients. Along the way, we establish essentially optimal nonasymptotic rates of concentration for the neural tangent kernel of deep fully-connected ReLU networks using martingale techniques, requiring width n ≥ L·poly(d0) to achieve uniform concentration of the initial kernel over a d0-dimensional submanifold of the unit sphere. Our approach should be of use in establishing similar results for other network architectures.
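To make the object of study concrete: the neural tangent kernel referenced above is Θ(x, x') = ⟨∇_θ f_θ(x), ∇_θ f_θ(x')⟩, the inner product of parameter gradients of the network output at two inputs. Below is a minimal NumPy sketch (not the paper's code) that evaluates this empirical kernel for a depth-L fully-connected ReLU network at a random He-style initialization; all function names and the specific initialization scale are illustrative assumptions.

```python
import numpy as np

def init_params(d0, n, L, rng):
    # L hidden ReLU layers of width n plus a scalar output layer,
    # with He-style random initialization (an assumed convention here).
    dims = [d0] + [n] * L + [1]
    return [rng.normal(0.0, np.sqrt(2.0 / dims[i]), (dims[i + 1], dims[i]))
            for i in range(len(dims) - 1)]

def forward(Ws, x):
    # Returns the scalar output and the per-layer activations.
    acts = [x]
    for W in Ws[:-1]:
        acts.append(np.maximum(W @ acts[-1], 0.0))
    return (Ws[-1] @ acts[-1]).item(), acts

def grad_params(Ws, x):
    # Backpropagates the scalar output to get the full parameter gradient,
    # flattened into a single vector.
    _, acts = forward(Ws, x)
    delta = np.ones((1,))                      # d(out)/d(out)
    grads = [np.outer(delta, acts[-1])]        # output-layer gradient
    delta = Ws[-1].T @ delta
    for i in range(len(Ws) - 2, -1, -1):
        pre = Ws[i] @ acts[i]
        delta = delta * (pre > 0)              # ReLU derivative
        grads.append(np.outer(delta, acts[i]))
        delta = Ws[i].T @ delta
    grads.reverse()
    return np.concatenate([g.ravel() for g in grads])

def empirical_ntk(Ws, x, xp):
    # Theta(x, x') = <grad f(x), grad f(x')> at the current parameters.
    return grad_params(Ws, x) @ grad_params(Ws, xp)
```

For unit-sphere inputs, `empirical_ntk` evaluated at initialization is a single random draw of the quantity whose uniform concentration (for width n ≥ L·poly(d0)) the paper establishes.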
Author Information
Samuel Buchanan (Columbia University)
Dar Gilboa (Columbia University)
John Wright (Columbia University)
More from the Same Authors
- 2021 Poster: Estimating the Unique Information of Continuous Variables »
  Ari Pakman · Amin Nejatbakhsh · Dar Gilboa · Abdullah Makkeh · Luca Mazzucato · Michael Wibral · Elad Schneidman
- 2021 Poster: Deep Networks Provably Classify Data on Curves »
  Tingran Wang · Sam Buchanan · Dar Gilboa · John Wright
- 2019 Poster: A Mean Field Theory of Quantized Deep Networks: The Quantization-Depth Trade-Off »
  Yaniv Blumenfeld · Dar Gilboa · Daniel Soudry
- 2018 Poster: Structured Local Minima in Sparse Blind Deconvolution »
  Yuqian Zhang · Han-wen Kuo · John Wright
- 2017 Poster: Convolutional Phase Retrieval »
  Qing Qu · Yuqian Zhang · Yonina Eldar · John Wright