

Search All 2024 Events

48 Results

Workshop
Accelerating Memory-Efficient LLM Training and Fine-Tuning via Tracking the Gradient Subspace
Sahar Rajabi · Sirisha Rambhatla
Poster
Wed 11:00 · Memory-Efficient LLM Training with Online Subspace Descent
Kaizhao Liang · Bo Liu · Lizhang Chen · Qiang Liu
Workshop
Duo-LLM: A Framework for Studying Adaptive Computation in Large Language Models
Keivan Alizadeh-Vahid · Iman Mirzadeh · Hooman Shahrkokhi · Dmitry Belenko · Frank Sun · Minsik Cho · Mohammad Hossein Sekhavat · Moin Nabi · Mehrdad Farajtabar
Workshop
Memory-Efficient Large Language Model (LLM) Training and Fine-Tuning via Gradient Subspace Tracking
Sahar Rajabi · Sirisha Rambhatla
Workshop
For Perception Tasks: The Cost of LLM Pretraining by Next-Token Prediction Outweighs its Benefits
Randall Balestriero · Hai Huang
Workshop
Eagle: Efficient Training-Free Router for Multi-LLM Inference
Zesen Zhao · Shuowei Jin · Zhuoqing Morley Mao
Poster
Thu 16:30 · Efficient Multi-task LLM Quantization and Serving for Multiple LoRA Adapters
Yifei Xia · Fangcheng Fu · Wentao Zhang · Jiawei Jiang · Bin Cui
Poster
Thu 16:30 · When LLM Meets DRL: Advancing Jailbreaking Efficiency via DRL-guided Search
Xuan Chen · Yuzhou Nie · Wenbo Guo · Xiangyu Zhang
Workshop
XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference
Joao Monteiro · Etienne Marcotte · Pierre-Andre Noel · Valentina Zantedeschi · David Vazquez · Nicolas Chapados · Christopher Pal · Perouz Taslakian
Workshop
Sirius: Contextual Sparsity with Correction for Efficient LLM
Yang Zhou · Zhuoming Chen · Zhaozhuo Xu · Victoria Lin · Beidi Chen
Workshop
Sat 10:30 · Internalizing ASR with Implicit Chain of Thought for Efficient Speech-to-Speech Conversational LLM
Robin Shing-Hei Yuen · Timothy Tse · Jian Zhu
Workshop
Towards Low-bit Communication for Tensor Parallel LLM Inference
Harry Dong · Tyler Johnson · Minsik Cho · Emad Soroush