S2D: Sorted Speculative Decoding For More Efficient Deployment of Large Language Models

Parsa Kavehzadeh · Mohammadreza Pourreza · Mojtaba Valipour · Tianshu Zhu · Haoli Bai · Ali Ghodsi · Boxing Chen · Mehdi Rezagholizadeh
