
Neural Collapse with Normalized Features: A Geometric Analysis over the Riemannian Manifold
Can Yaras · Peng Wang · Zhihui Zhu · Laura Balzano · Qing Qu

Wed Nov 30 02:00 PM -- 04:00 PM (PST) @ Hall J #419

When training overparameterized deep networks for classification tasks, it has been widely observed that the learned features exhibit a so-called "neural collapse" phenomenon. More specifically, for the output features of the penultimate layer, the within-class features converge to their class means, and the means of different classes exhibit a certain tight frame structure, which is also aligned with the last layer's classifier. As feature normalization in the last layer has become common practice in modern representation learning, in this work we theoretically justify the neural collapse phenomenon under normalized features. Based on an unconstrained feature model, we simplify the empirical loss function of a multi-class classification task into a nonconvex optimization problem over a Riemannian manifold by constraining all features and classifiers to the sphere. In this context, we analyze the nonconvex landscape of the Riemannian optimization problem over the product of spheres, showing a benign global landscape in the sense that the only global minimizers are the neural collapse solutions, while all other critical points are strict saddle points with negative curvature. Experimental results on practical deep networks corroborate our theory and demonstrate that better representations can be learned faster via feature normalization. Code for our experiments can be found at https://github.com/cjyaras/normalized-neural-collapse.
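To make the setup concrete, the following is a minimal NumPy sketch of the normalized unconstrained feature model described above: features and classifiers are treated as free variables on the unit sphere, and a neural collapse configuration (within-class features equal to their class mean, class means forming a simplex equiangular tight frame aligned with the classifier) is compared against a random spherical configuration under the cross-entropy loss. The dimensions `K`, `d`, `n` and the inverse temperature `tau` are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
# Assumed toy sizes: K classes, feature dimension d, n samples per class,
# inverse temperature tau scaling the spherical logits.
K, d, n, tau = 4, 4, 5, 10.0

def normalize(X):
    """Project vectors along the last axis onto the unit sphere."""
    return X / np.linalg.norm(X, axis=-1, keepdims=True)

def ce_loss(H, W):
    """Cross-entropy of the unconstrained feature model.

    H: (K, n, d) unit-norm features, H[k, i] is sample i of class k.
    W: (K, d) unit-norm classifier vectors.
    """
    logits = tau * H @ W.T                      # (K, n, K)
    logits -= logits.max(axis=-1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=-1, keepdims=True)
    # Probability assigned to the correct class k for each sample.
    correct = p[np.arange(K), :, np.arange(K)]    # (K, n)
    return -np.mean(np.log(correct))

# Random configuration: features and classifiers drawn uniformly on the sphere.
H_rand = normalize(rng.standard_normal((K, n, d)))
W_rand = normalize(rng.standard_normal((K, d)))

# Neural collapse configuration: class means form a simplex ETF
# (pairwise inner products -1/(K-1)), features collapse to their means,
# and the classifier aligns with the feature means.
E = np.eye(K)
M = normalize(E - E.mean(axis=0))               # (K, d) simplex ETF, here d = K
H_nc = np.repeat(M[:, None, :], n, axis=1)      # within-class collapse
W_nc = M.copy()                                 # self-duality with classifier

loss_rand = ce_loss(H_rand, W_rand)
loss_nc = ce_loss(H_nc, W_nc)
print(f"random: {loss_rand:.4f}, neural collapse: {loss_nc:.6f}")
assert loss_nc < loss_rand
```

The sketch only evaluates the loss at two points; the paper's contribution is the landscape result that, over the product of spheres, such collapse configurations are the *only* global minimizers and every other critical point is a strict saddle.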

Author Information

Can Yaras (University of Michigan - Ann Arbor)
Peng Wang (University of Michigan - Ann Arbor)
Zhihui Zhu (University of Denver)
Laura Balzano (University of Michigan - Ann Arbor)
Qing Qu (University of Michigan)
