

Search All 2024 Events
 


Poster
Speculative Decoding with CTC-based Draft Model for LLM Inference Acceleration
Zhuofan Wen · Shangtong Gui · Yang Feng
Poster
Fri 16:30 Kangaroo: Lossless Self-Speculative Decoding for Accelerating LLMs via Double Early Exiting
Fangcheng Liu · Yehui Tang · Zhenhua Liu · Yunsheng Ni · Duyu Tang · Kai Han · Yunhe Wang
Poster
Wed 16:30 Fast Best-of-N Decoding via Speculative Rejection
Hanshi Sun · Momin Haider · Ruiqi Zhang · Huitao Yang · Jiahao Qiu · Ming Yin · Mengdi Wang · Peter Bartlett · Andrea Zanette
Poster
Wed 11:00 Efficient Minimum Bayes Risk Decoding using Low-Rank Matrix Completion Algorithms
Firas Trabelsi · David Vilar · Mara Finkelstein · Markus Freitag
Poster
Wed 11:00 Sequoia: Scalable and Robust Speculative Decoding
Zhuoming Chen · Avner May · Ruslan Svirschevski · Yu-Hsun Huang · Max Ryabinin · Zhihao Jia · Beidi Chen
Poster
Fri 11:00 SpecExec: Massively Parallel Speculative Decoding For Interactive LLM Inference on Consumer Devices
Ruslan Svirschevski · Avner May · Zhuoming Chen · Beidi Chen · Zhihao Jia · Max Ryabinin
Poster
Thu 11:00 Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass
Ethan Shen · Alan Fan · Sarah Pratt · Jae Sung Park · Matthew Wallingford · Sham Kakade · Ari Holtzman · Ranjay Krishna · Ali Farhadi · Aditya Kusupati