Distributed Speculative Inference of Large Language Models is Provably Faster

Nadav Timor ⋅ Jonathan Mamou ⋅ Oren Pereg ⋅ Moshe Berchansky ⋅ Daniel Korat ⋅ Moshe Wasserblat ⋅ Tomer Galanti ⋅ Michal Gordon (Kiwkowitz) ⋅ David Harel

Abstract
