Speculative Streaming: Fast LLM Inference without Auxiliary Models

Nikhil Bhendawade · Mahyar Najibi · Irina Belousova · Qichen Fu · Henry Mason · Mohammad Rastegari

Abstract