Poster Fri, Dec 5, 2025 • 4:30 PM – 7:30 PM PST

RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval

Di Liu ⋅ Meng Chen ⋅ Baotong Lu ⋅ Huiqiang Jiang ⋅ Zhenhua Han ⋅ Qianxi Zhang ⋅ Qi Chen ⋅ Chengruidong Zhang ⋅ Bailu Ding ⋅ Kai Zhang ⋅ Chen Chen ⋅ Fan Yang ⋅ Yuqing Yang ⋅ Lili Qiu


