Layer Importance for Mathematical Reasoning is Forged in Pre-Training and Invariant after Post-Training

Aadim Nepal · Safal Shrestha · Anubhav Shrestha · Minwu Kim · Jalal Naghiyev · Ravid Shwartz-Ziv · Keith Ross

Abstract
