79 Results
Type | Time | Title | Authors
Poster | | Speculative Decoding with CTC-based Draft Model for LLM Inference Acceleration | Zhuofan Wen · Shangtong Gui · Yang Feng
Poster | Fri 16:30 | Kangaroo: Lossless Self-Speculative Decoding for Accelerating LLMs via Double Early Exiting | Fangcheng Liu · Yehui Tang · Zhenhua Liu · Yunsheng Ni · Duyu Tang · Kai Han · Yunhe Wang
Poster | Wed 16:30 | Fast Best-of-N Decoding via Speculative Rejection | Hanshi Sun · Momin Haider · Ruiqi Zhang · Huitao Yang · Jiahao Qiu · Ming Yin · Mengdi Wang · Peter Bartlett · Andrea Zanette
Poster | Wed 11:00 | Efficient Minimum Bayes Risk Decoding using Low-Rank Matrix Completion Algorithms | Firas Trabelsi · David Vilar · Mara Finkelstein · Markus Freitag
Poster | Wed 11:00 | Sequoia: Scalable and Robust Speculative Decoding | Zhuoming Chen · Avner May · Ruslan Svirschevski · Yu-Hsun Huang · Max Ryabinin · Zhihao Jia · Beidi Chen
Poster | Fri 11:00 | SpecExec: Massively Parallel Speculative Decoding For Interactive LLM Inference on Consumer Devices | Ruslan Svirschevski · Avner May · Zhuoming Chen · Beidi Chen · Zhihao Jia · Max Ryabinin
Poster | Thu 11:00 | Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass | Ethan Shen · Alan Fan · Sarah Pratt · Jae Sung Park · Matthew Wallingford · Sham Kakade · Ari Holtzman · Ranjay Krishna · Ali Farhadi · Aditya Kusupati