RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval

Di Liu ⋅ Meng Chen ⋅ Baotong Lu ⋅ Huiqiang Jiang ⋅ Zhenhua Han ⋅ Qianxi Zhang ⋅ Qi Chen ⋅ Chengruidong Zhang ⋅ Bailu Ding ⋅ Kai Zhang ⋅ Chen Chen ⋅ Fan Yang ⋅ Yuqing Yang ⋅ Lili Qiu

