Layer Importance for Mathematical Reasoning is Forged in Pre-Training and Invariant after Post-Training

Aadim Nepal ⋅ Safal Shrestha ⋅ Anubhav Shrestha ⋅ Minwu Kim ⋅ Jalal Naghiyev ⋅ Ravid Shwartz-Ziv ⋅ Keith Ross

Abstract
