Timezone: »
Invited speaker: The Convexity of Learning Infinite-width Deep Neural Networks, Tong Zhang
Tong Zhang
Deep learning has received considerable empirical successes in recent years. Although deep neural networks (DNNs) are highly nonconvex with respect to the model parameters, it has been observed that the training of overparametrized DNNs leads to consistent solutions that are highly reproducible with different random initializations.
I will explain this phenomenon by modeling DNNs using feature representations, and show that the optimization landscape is convex with respect to the features. Moreover, we show that optimization with respect to the nonconvex DNN parameters leads to a global optimal solution under an idealized regularity condition, which can explain various empirical findings.
Author Information
Tong Zhang (Tencent AI Lab)
More from the Same Authors
-
2022 : A Neural Tangent Kernel Perspective on Function-Space Regularization in Neural Networks »
Zonghao Chen · Xupeng Shi · Tim G. J. Rudner · Qixuan Feng · Weizhong Zhang · Tong Zhang -
2022 : Particle-based Variational Inference with Preconditioned Functional Gradient Flow »
Hanze Dong · Xi Wang · Yong Lin · Tong Zhang -
2022 : Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint »
Hao Liu · Minshuo Chen · Siawpeng Er · Wenjing Liao · Tong Zhang · Tuo Zhao -
2023 Poster: Double Randomized Underdamped Langevin with Dimension-Independent Convergence Guarantee »
Yuanshi Liu · Cong Fang · Tong Zhang -
2023 Poster: Double Pessimism is Provably Efficient for Distributionally Robust Offline Reinforcement Learning: Generic Algorithm and Robust Partial Coverage »
Jose Blanchet · Miao Lu · Tong Zhang · Han Zhong -
2023 Poster: Posterior Sampling for Competitive RL: Function Approximation and Partial Observation »
Shuang Qiu · Ziyu Dai · Han Zhong · Zhaoran Wang · Zhuoran Yang · Tong Zhang -
2023 Poster: Corruption-Robust Offline Reinforcement Learning with General Function Approximation »
Chenlu Ye · Rui Yang · Quanquan Gu · Tong Zhang -
2023 Poster: Inconsistency, Instability, and Generalization Gap of Deep Neural Network Training »
Rie Johnson · Tong Zhang -
2023 Poster: A Theoretical Analysis of Optimistic Proximal Policy Optimization in Linear Markov Decision Processes »
Han Zhong · Tong Zhang -
2022 Poster: When is the Convergence Time of Langevin Algorithms Dimension Independent? A Composite Optimization Viewpoint »
Yoav S Freund · Yi-An Ma · Tong Zhang -
2022 Poster: Model-based RL with Optimistic Posterior Sampling: Structural Conditions and Sample Complexity »
Alekh Agarwal · Tong Zhang -
2022 Poster: Nearly Optimal Algorithms for Linear Contextual Bandits with Adversarial Corruptions »
Jiafan He · Dongruo Zhou · Tong Zhang · Quanquan Gu -
2021 : HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning »
Ziniu Li · Yingru Li · Yushun Zhang · Tong Zhang · Zhiquan Luo -
2021 : HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning »
Ziniu Li · Yingru Li · Yushun Zhang · Tong Zhang · Zhiquan Luo -
2021 Poster: A Provably Efficient Model-Free Posterior Sampling Method for Episodic Reinforcement Learning »
Christoph Dann · Mehryar Mohri · Tong Zhang · Julian Zimmert -
2021 Poster: Efficient Neural Network Training via Forward and Backward Propagation Sparsification »
Xiao Zhou · Weizhong Zhang · Zonghao Chen · SHIZHE DIAO · Tong Zhang -
2021 Poster: Error Compensated Distributed SGD Can Be Accelerated »
Xun Qian · Peter Richtarik · Tong Zhang -
2020 Poster: Model Rubik’s Cube: Twisting Resolution, Depth and Width for TinyNets »
Kai Han · Yunhe Wang · Qiulin Zhang · Wei Zhang · Chunjing XU · Tong Zhang -
2020 Poster: A Generalized Neural Tangent Kernel Analysis for Two-layer Neural Networks »
Zixiang Chen · Yuan Cao · Quanquan Gu · Tong Zhang -
2020 Poster: Residual Distillation: Towards Portable Deep Neural Networks without Shortcuts »
Guilin Li · Junlei Zhang · Yunhe Wang · Chuanjian Liu · Matthias Tan · Yunfeng Lin · Wei Zhang · Jiashi Feng · Tong Zhang -
2020 Poster: Stochastic Recursive Gradient Descent Ascent for Stochastic Nonconvex-Strongly-Concave Minimax Problems »
Luo Luo · Haishan Ye · Zhichao Huang · Tong Zhang -
2020 Poster: Bridging the Gap between Sample-based and One-shot Neural Architecture Search with BONAS »
Han Shi · Renjie Pi · Hang Xu · Zhenguo Li · James Kwok · Tong Zhang -
2020 Poster: Decentralized Accelerated Proximal Gradient Descent »
Haishan Ye · Ziang Zhou · Luo Luo · Tong Zhang -
2020 Poster: How to Characterize The Landscape of Overparameterized Convolutional Neural Networks »
Yihong Gu · Weizhong Zhang · Cong Fang · Jason Lee · Tong Zhang -
2019 Poster: Divergence-Augmented Policy Optimization »
Qing Wang · Yingru Li · Jiechao Xiong · Tong Zhang -
2018 Poster: Communication Compression for Decentralized Training »
Hanlin Tang · Shaoduo Gan · Ce Zhang · Tong Zhang · Ji Liu -
2018 Poster: SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path-Integrated Differential Estimator »
Cong Fang · Chris Junchi Li · Zhouchen Lin · Tong Zhang -
2018 Spotlight: SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path-Integrated Differential Estimator »
Cong Fang · Chris Junchi Li · Zhouchen Lin · Tong Zhang -
2018 Poster: Stochastic Primal-Dual Method for Empirical Risk Minimization with O(1) Per-Iteration Complexity »
Conghui Tan · Tong Zhang · Shiqian Ma · Ji Liu -
2018 Poster: Exponentially Weighted Imitation Learning for Batched Historical Data »
Qing Wang · Jiechao Xiong · Lei Han · peng sun · Han Liu · Tong Zhang -
2018 Poster: Gradient Sparsification for Communication-Efficient Distributed Optimization »
Jianqiao Wangni · Jialei Wang · Ji Liu · Tong Zhang -
2017 Poster: Diffusion Approximations for Online Principal Component Estimation and Global Convergence »
Chris Junchi Li · Mengdi Wang · Tong Zhang -
2017 Oral: Diffusion Approximations for Online Principal Component Estimation and Global Convergence »
Chris Junchi Li · Mengdi Wang · Tong Zhang -
2017 Poster: On Quadratic Convergence of DC Proximal Newton Algorithm in Nonconvex Sparse Learning »
Xingguo Li · Lin Yang · Jason Ge · Jarvis Haupt · Tong Zhang · Tuo Zhao -
2016 Poster: Exact Recovery of Hard Thresholding Pursuit »
Xiaotong Yuan · Ping Li · Tong Zhang -
2016 Poster: Learning Additive Exponential Family Graphical Models via $\ell_{2,1}$-norm Regularized M-Estimation »
Xiaotong Yuan · Ping Li · Tong Zhang · Qingshan Liu · Guangcan Liu -
2015 Poster: Quartz: Randomized Dual Coordinate Ascent with Arbitrary Sampling »
Zheng Qu · Peter Richtarik · Tong Zhang -
2015 Poster: Local Smoothness in Variance Reduced Optimization »
Daniel Vainsencher · Han Liu · Tong Zhang -
2015 Poster: Semi-supervised Convolutional Neural Networks for Text Categorization via Region Embedding »
Rie Johnson · Tong Zhang -
2015 Spotlight: Semi-supervised Convolutional Neural Networks for Text Categorization via Region Embedding »
Rie Johnson · Tong Zhang -
2013 Poster: Accelerating Stochastic Gradient Descent using Predictive Variance Reduction »
Rie Johnson · Tong Zhang -
2013 Poster: Accelerated Mini-Batch Stochastic Dual Coordinate Ascent »
Shai Shalev-Shwartz · Tong Zhang -
2012 Workshop: Modern Nonparametric Methods in Machine Learning »
Sivaraman Balakrishnan · Arthur Gretton · Mladen Kolar · John Lafferty · Han Liu · Tong Zhang -
2012 Poster: Selective Labeling via Error Bound Minimization »
Quanquan Gu · Tong Zhang · Chris Ding · Jiawei Han -
2011 Poster: Learning to Search Efficiently in High Dimensions »
Zhen Li · Huazhong Ning · Liangliang Cao · Tong Zhang · Yihong Gong · Thomas S Huang -
2011 Poster: Spectral Methods for Learning Multivariate Latent Tree Structure »
Anima Anandkumar · Kamalika Chaudhuri · Daniel Hsu · Sham M Kakade · Le Song · Tong Zhang -
2011 Poster: Greedy Model Averaging »
Dong Dai · Tong Zhang -
2010 Poster: Deep Coding Network »
Yuanqing Lin · Tong Zhang · Shenghuo Zhu · Kai Yu -
2010 Poster: Agnostic Active Learning Without Constraints »
Alina Beygelzimer · Daniel Hsu · John Langford · Tong Zhang -
2009 Poster: Multi-Label Prediction via Compressed Sensing »
Daniel Hsu · Sham M Kakade · John Langford · Tong Zhang -
2009 Poster: Nonlinear Learning using Local Coordinate Coding »
Kai Yu · Tong Zhang · Yihong Gong -
2009 Oral: Multi-Label Prediction via Compressed Sensing »
Daniel Hsu · Sham M Kakade · John Langford · Tong Zhang -
2008 Poster: Adaptive Forward-Backward Greedy Algorithm for Sparse Learning with Linear Models »
Tong Zhang -
2008 Oral: Adaptive Forward-Backward Greedy Algorithm for Sparse Learning with Linear Models »
Tong Zhang -
2008 Poster: Sparse Online Learning via Truncated Gradient »
John Langford · Lihong Li · Tong Zhang -
2008 Spotlight: Sparse Online Learning via Truncated Gradient »
John Langford · Lihong Li · Tong Zhang -
2008 Poster: Multi-stage Convex Relaxation for Learning with Sparse Regularization »
Tong Zhang -
2007 Poster: A General Boosting Method and its Application to Learning Ranking Functions for Web Search »
Zhaohui Zheng · Hongyuan Zha · Tong Zhang · Olivier Chapelle · Keke Chen · Gordon Sun -
2007 Poster: The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information »
John Langford · Tong Zhang -
2006 Poster: Learning on Graph with Laplacian Regularization »
Rie Ando · Tong Zhang