It has been empirically observed that training large models with weighted cross-entropy (CE) beyond the zero-training-error regime is not a satisfactory remedy for label-imbalanced data. Instead, researchers have proposed the vector-scaling (VS) loss, a parameterization of the CE loss that is tailored to this modern training regime. The theory of implicit bias has been the main tool for understanding how such parameterizations affect the gradient-descent path. Specifically, for linear(ized) models, this theory explains why weighted CE fails and how the VS-loss biases the optimization path towards solutions that favor minorities. Beyond linear models, however, the description of implicit bias is more obscure. To gain insight into the impact of different CE-parameterizations in non-linear models, we investigate the implicit geometry of the learnt classifiers and embeddings. Our main result characterizes the global minimizers of a non-convex cost-sensitive SVM classifier for the so-called unconstrained features model, which serves as an abstraction of deep models. We also study empirically the convergence of SGD to this global minimizer, observing slow-downs as the imbalance ratio and the scalings of the loss hyperparameters increase.
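As a rough illustration of what a CE parameterization of this kind can look like, the following is a minimal NumPy sketch. It assumes the commonly cited form of the VS loss, in which each class logit is scaled by a multiplicative factor and shifted by an additive offset before the softmax, with an optional per-class weight on the loss. The function name `vs_loss` and the argument names `delta`, `iota`, and `omega` are illustrative choices and do not come from this page.

```python
import numpy as np

def vs_loss(logits, labels, delta, iota, omega=None):
    """Sketch of a vector-scaling style loss (assumed form, not the authors' code).

    logits: (n, k) array of model outputs
    labels: (n,) integer class labels in [0, k)
    delta:  (k,) multiplicative per-class logit factors (assumed name)
    iota:   (k,) additive per-class logit offsets (assumed name)
    omega:  optional (k,) per-class loss weights (assumed name)
    """
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels)
    # Apply the per-class scaling and shift to every row of logits.
    adjusted = logits * np.asarray(delta, dtype=float) + np.asarray(iota, dtype=float)
    # Numerically stable log-softmax.
    adjusted = adjusted - adjusted.max(axis=1, keepdims=True)
    log_probs = adjusted - np.log(np.exp(adjusted).sum(axis=1, keepdims=True))
    per_example = -log_probs[np.arange(len(labels)), labels]
    if omega is not None:
        # Weighted CE is the special case delta = 1, iota = 0 with omega set.
        per_example = per_example * np.asarray(omega, dtype=float)[labels]
    return per_example.mean()
```

With `delta` set to all ones and `iota` to all zeros, this sketch reduces to ordinary cross-entropy, which makes it easy to compare the two settings on the same data.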
Author Information
Tina Behnia (University of British Columbia)
Ganesh Ramachandra Kini (UC Santa Barbara)
Vala Vakilian (University of British Columbia)
Christos Thrampoulidis (University of British Columbia)
More from the Same Authors
- 2022 : Generalization of Decentralized Gradient Descent with Separable Data
  Hossein Taheri · Christos Thrampoulidis
- 2022 : Fast Convergence of Random Reshuffling under Interpolation and the Polyak-Łojasiewicz Condition
  Chen Fan · Christos Thrampoulidis · Mark Schmidt
- 2023 Poster: BiSLS/SPS: Auto-tune Step Sizes for Stable Bi-level Optimization
  Chen Fan · Gaspard Choné-Ducasse · Mark Schmidt · Christos Thrampoulidis
- 2022 : Poster Session 1
  Andrew Lowy · Thomas Bonnier · Yiling Xie · Guy Kornowski · Simon Schug · Seungyub Han · Nicolas Loizou · xinwei zhang · Laurent Condat · Tabea E. Röber · Si Yi Meng · Marco Mondelli · Runlong Zhou · Eshaan Nichani · Adrian Goldwaser · Rudrajit Das · Kayhan Behdin · Atish Agarwala · Mukul Gagrani · Gary Cheng · Tian Li · Haoran Sun · Hossein Taheri · Allen Liu · Siqi Zhang · Dmitrii Avdiukhin · Bradley Brown · Miaolan Xie · Junhyung Lyle Kim · Sharan Vaswani · Xinmeng Huang · Ganesh Ramachandra Kini · Angela Yuan · Weiqiang Zheng · Jiajin Li
- 2022 Poster: Imbalance Trouble: Revisiting Neural-Collapse Geometry
  Christos Thrampoulidis · Ganesh Ramachandra Kini · Vala Vakilian · Tina Behnia
- 2022 Poster: Mirror Descent Maximizes Generalized Margin and Can Be Implemented Efficiently
  Haoyuan Sun · Kwangjun Ahn · Christos Thrampoulidis · Navid Azizan
- 2021 Poster: AutoBalance: Optimized Loss Functions for Imbalanced Data
  Mingchen Li · Xuechen Zhang · Christos Thrampoulidis · Jiasi Chen · Samet Oymak
- 2021 Poster: UCB-based Algorithms for Multinomial Logistic Regression Bandits
  Sanae Amani · Christos Thrampoulidis
- 2021 Poster: Label-Imbalanced and Group-Sensitive Classification under Overparameterization
  Ganesh Ramachandra Kini · Orestis Paraskevas · Samet Oymak · Christos Thrampoulidis
- 2021 Poster: Benign Overfitting in Multiclass Classification: All Roads Lead to Interpolation
  Ke Wang · Vidya Muthukumar · Christos Thrampoulidis
- 2020 Poster: Theoretical Insights Into Multiclass Classification: A High-dimensional Asymptotic View
  Christos Thrampoulidis · Samet Oymak · Mahdi Soltanolkotabi
- 2020 Poster: Stage-wise Conservative Linear Bandits
  Ahmadreza Moradipari · Christos Thrampoulidis · Mahnoosh Alizadeh