

Poster

Improving Model Representation and Reducing KV Cache via Skip Connections with First Value Heads

Zhoutong Wu ⋅ Yuan Zhang ⋅ Yiming Dong ⋅ Chenheng Zhang ⋅ Cong Fang ⋅ Kun Yuan ⋅ Zhouchen Lin
2025 Poster
