The stunning empirical successes of neural networks currently lack rigorous theoretical explanation. What form would such an explanation take, in the face of existing complexity-theoretic lower bounds? A first step might be to show that data generated by neural networks with a single hidden layer, smooth activation functions and benign input distributions can be learned efficiently. We demonstrate here a comprehensive lower bound ruling out this possibility: for a wide class of activation functions (including all those currently in use), and inputs drawn from any logconcave distribution, there is a family of one-hidden-layer functions, each with a sum gate as its output, that is hard to learn in a precise sense: any statistical query algorithm (a class that includes all known variants of stochastic gradient descent with any loss function) needs an exponential number of queries, even with tolerance inversely proportional to the input dimensionality. Moreover, this hard family of functions is realizable with a small (sublinear in dimension) number of activation units in the single hidden layer. The lower bound is also robust to small perturbations of the true weights. Systematic experiments illustrate a phase transition in the training error, as predicted by the analysis.
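To make the objects named in the abstract concrete, here is a minimal Python sketch of a one-hidden-layer function with a sum gate and of a simulated statistical query oracle. It is an illustration under stated assumptions, not the construction from the paper: the sigmoid activation, the standard Gaussian inputs (one example of a logconcave distribution), the random weights, and the names `one_hidden_layer_sum` and `sq_oracle` are all hypothetical choices made here. The oracle approximates the expectation by sampling, whereas a true statistical query oracle may return any value within the tolerance of the exact expectation.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function (one example of a smooth activation)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def one_hidden_layer_sum(x, weights, biases, activation=sigmoid):
    """f(x) = sum_i activation(<w_i, x> + b_i): hidden units feeding an unweighted sum gate."""
    return sum(
        activation(sum(wj * xj for wj, xj in zip(w, x)) + b)
        for w, b in zip(weights, biases)
    )


def sq_oracle(query, target, dim, tolerance, n_samples=2000, rng=None):
    """Simulated statistical query (assumption: Monte Carlo stands in for the exact expectation).

    Estimates E[query(x, target(x))] for x ~ N(0, I), then perturbs the
    estimate by at most `tolerance` to mimic the oracle's allowed error.
    """
    rng = rng or random.Random(0)
    total = 0.0
    for _ in range(n_samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        total += query(x, target(x))
    return total / n_samples + rng.uniform(-tolerance, tolerance)


# Usage: a small random network and one query with tolerance 1/dim.
dim, hidden_units = 8, 3
init = random.Random(1)
weights = [[init.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(hidden_units)]
biases = [0.0] * hidden_units


def target(x):
    return one_hidden_layer_sum(x, weights, biases)


# Query the mean label; a learner only ever sees such noisy expectations.
answer = sq_oracle(lambda x, y: y, target, dim, tolerance=1.0 / dim)
```

The point of the sketch is the interface: a statistical query learner never sees individual samples, only expectations answered up to the tolerance, which is the access model under which the abstract's exponential query lower bound is stated.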
Author Information
Le Song (Ant Financial & Georgia Institute of Technology)
Santosh Vempala (Georgia Tech)
John Wilmes (Georgia Institute of Technology)
Bo Xie (Georgia Tech)
Related Events (a corresponding poster, oral, or spotlight)
2017 Spotlight: On the Complexity of Learning Neural Networks
Tue Dec 5th 11:35 -- 11:40 PM Room Hall C
More from the Same Authors
2020 Poster: Understanding Deep Architecture with Reasoning Layer
Xinshi Chen · Yufei Zhang · Christoph Reisinger · Le Song
2020 Poster: The Devil is in the Detail: A Framework for Macroscopic Prediction via Microscopic Models
Yingxiang Yang · Negar Kiyavash · Le Song · Niao He
2020 Spotlight: The Devil is in the Detail: A Framework for Macroscopic Prediction via Microscopic Models
Yingxiang Yang · Negar Kiyavash · Le Song · Niao He
2019 Workshop: Learning with Temporal Point Processes
Manuel Rodriguez · Le Song · Isabel Valera · Yan Liu · Abir De · Hongyuan Zha
2019 Poster: Neural Similarity Learning
Weiyang Liu · Zhen Liu · James Rehg · Le Song
2019 Poster: Meta Architecture Search
Albert Shaw · Wei Wei · Weiyang Liu · Le Song · Bo Dai
2019 Poster: Exponential Family Estimation via Adversarial Dynamics Embedding
Bo Dai · Zhen Liu · Hanjun Dai · Niao He · Arthur Gretton · Le Song · Dale Schuurmans
2019 Poster: Retrosynthesis Prediction with Conditional Graph Logic Network
Hanjun Dai · Chengtao Li · Connor Coley · Bo Dai · Le Song
2018 Poster: Learning Loop Invariants for Program Verification
Xujie Si · Hanjun Dai · Mukund Raghothaman · Mayur Naik · Le Song
2018 Spotlight: Learning Loop Invariants for Program Verification
Xujie Si · Hanjun Dai · Mukund Raghothaman · Mayur Naik · Le Song
2018 Poster: Coupled Variational Bayes via Optimization Embedding
Bo Dai · Hanjun Dai · Niao He · Weiyang Liu · Zhen Liu · Jianshu Chen · Lin Xiao · Le Song
2018 Poster: Learning Temporal Point Processes via Reinforcement Learning
Shuang Li · Shuai Xiao · Shixiang Zhu · Nan Du · Yao Xie · Le Song
2018 Spotlight: Learning Temporal Point Processes via Reinforcement Learning
Shuang Li · Shuai Xiao · Shixiang Zhu · Nan Du · Yao Xie · Le Song
2018 Poster: Learning towards Minimum Hyperspherical Energy
Weiyang Liu · Rongmei Lin · Zhen Liu · Lixin Liu · Zhiding Yu · Bo Dai · Le Song
2017 Poster: Predicting User Activity Level In Point Processes With Mass Transport Equation
Yichen Wang · Xiaojing Ye · Hongyuan Zha · Le Song
2017 Poster: Learning Combinatorial Optimization Algorithms over Graphs
Elias Khalil · Hanjun Dai · Yuyu Zhang · Bistra Dilkina · Le Song
2017 Spotlight: Learning Combinatorial Optimization Algorithms over Graphs
Elias Khalil · Hanjun Dai · Yuyu Zhang · Bistra Dilkina · Le Song
2017 Poster: Deep Hyperspherical Learning
Weiyang Liu · Yan-Ming Zhang · Xingguo Li · Zhiding Yu · Bo Dai · Tuo Zhao · Le Song
2017 Spotlight: Deep Hyperspherical Learning
Weiyang Liu · Yan-Ming Zhang · Xingguo Li · Zhiding Yu · Bo Dai · Tuo Zhao · Le Song
2017 Poster: Wasserstein Learning of Deep Generative Point Process Models
Shuai Xiao · Mehrdad Farajtabar · Xiaojing Ye · Junchi Yan · Xiaokang Yang · Le Song · Hongyuan Zha
2016 Poster: Multistage Campaigning in Social Networks
Mehrdad Farajtabar · Xiaojing Ye · Sahar Harati · Le Song · Hongyuan Zha
2016 Poster: Coevolutionary Latent Feature Processes for Continuous-Time User-Item Interactions
Yichen Wang · Nan Du · Rakshit Trivedi · Le Song
2015 Poster: Time-Sensitive Recommendation From Recurrent User Activities
Nan Du · Yichen Wang · Niao He · Jimeng Sun · Le Song
2015 Poster: Scale Up Nonlinear Component Analysis with Doubly Stochastic Gradients
Bo Xie · Yingyu Liang · Le Song
2015 Poster: Efficient Learning of Continuous-Time Hidden Markov Models for Disease Progression
Yu-Ying Liu · Shuang Li · Fuxin Li · Le Song · James Rehg
2015 Poster: COEVOLVE: A Joint Point Process Model for Information Diffusion and Network Co-evolution
Mehrdad Farajtabar · Yichen Wang · Manuel Rodriguez · Shuang Li · Hongyuan Zha · Le Song
2015 Oral: COEVOLVE: A Joint Point Process Model for Information Diffusion and Network Co-evolution
Mehrdad Farajtabar · Yichen Wang · Manuel Rodriguez · Shuang Li · Hongyuan Zha · Le Song
2015 Poster: M-Statistic for Kernel Change-Point Detection
Shuang Li · Yao Xie · Hanjun Dai · Le Song
2014 Poster: Active Learning and Best-Response Dynamics
Maria-Florina F Balcan · Christopher Berlind · Avrim Blum · Emma Cohen · Kaushik Patnaik · Le Song
2014 Poster: Learning Time-Varying Coverage Functions
Nan Du · Yingyu Liang · Maria-Florina F Balcan · Le Song
2014 Poster: Shaping Social Activity by Incentivizing Users
Mehrdad Farajtabar · Nan Du · Manuel Gomez Rodriguez · Isabel Valera · Hongyuan Zha · Le Song
2014 Poster: Scalable Kernel Methods via Doubly Stochastic Gradients
Bo Dai · Bo Xie · Niao He · Yingyu Liang · Anant Raj · Maria-Florina F Balcan · Le Song
2013 Poster: Robust Low Rank Kernel Embeddings of Multivariate Distributions
Le Song · Bo Dai
2013 Poster: Scalable Influence Estimation in Continuous-Time Diffusion Networks
Nan Du · Le Song · Manuel Gomez Rodriguez · Hongyuan Zha
2013 Oral: Scalable Influence Estimation in Continuous-Time Diffusion Networks
Nan Du · Le Song · Manuel Gomez Rodriguez · Hongyuan Zha
2012 Workshop: Confluence between Kernel Methods and Graphical Models
Le Song · Arthur Gretton · Alexander Smola
2012 Workshop: Spectral Algorithms for Latent Variable Models
Ankur P Parikh · Le Song · Eric Xing
2012 Poster: Learning Networks of Heterogeneous Influence
Nan Du · Le Song · Alexander Smola · Ming Yuan
2012 Spotlight: Learning Networks of Heterogeneous Influence
Nan Du · Le Song · Alexander Smola · Ming Yuan