

Poster

How Transformers Utilize Multi-Head Attention in In-Context Learning? A Case Study on Sparse Linear Regression

Xingwu Chen · Lei Zhao · Difan Zou
2024 Poster

