Spotlight Poster

Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models

Frederik Kunstner · Alan Milligan · Robin Yadav · Mark Schmidt · Alberto Bietti
2024 Spotlight Poster
