Distributed Speculative Inference of Large Language Models is Provably Faster

Nadav Timor · Jonathan Mamou · Oren Pereg · Moshe Berchansky · Daniel Korat · Moshe Wasserblat · Tomer Galanti · Michal Gordon (Kiwkowitz) · David Harel

Abstract

