Poster

Confidence Regulation Neurons in Language Models

Alessandro Stolfo ⋅ Ben Wu ⋅ Wes Gurnee ⋅ Yonatan Belinkov ⋅ Xingyi Song ⋅ Mrinmaya Sachan ⋅ Neel Nanda
2024 Poster
