48 Results
Workshop | Accelerating Memory-Efficient LLM Training and Fine-Tuning via Tracking the Gradient Subspace | Sahar Rajabi · Sirisha Rambhatla
Poster | Wed 11:00 | Memory-Efficient LLM Training with Online Subspace Descent | Kaizhao Liang · Bo Liu · Lizhang Chen · Qiang Liu
Workshop | Duo-LLM: A Framework for Studying Adaptive Computation in Large Language Models | Keivan Alizadeh-Vahid · Iman Mirzadeh · Hooman Shahrkokhi · Dmitry Belenko · Frank Sun · Minsik Cho · Mohammad Hossein Sekhavat · Moin Nabi · Mehrdad Farajtabar
Workshop | Memory-Efficient Large Language Model (LLM) Training and Fine-Tuning via Gradient Subspace Tracking | Sahar Rajabi · Sirisha Rambhatla
Workshop | For Perception Tasks: The Cost of LLM Pretraining by Next-Token Prediction Outweigh its Benefits | Randall Balestriero · Hai Huang
Workshop | Eagle: Efficient Training-Free Router for Multi-LLM Inference | Zesen Zhao · Shuowei Jin · Zhuoqing Morley Mao
Poster | Thu 16:30 | Efficient Multi-task LLM Quantization and Serving for Multiple LoRA Adapters | Yifei Xia · Fangcheng Fu · Wentao Zhang · Jiawei Jiang · Bin CUI
Poster | Thu 16:30 | When LLM Meets DRL: Advancing Jailbreaking Efficiency via DRL-guided Search | Xuan Chen · Yuzhou Nie · Wenbo Guo · Xiangyu Zhang
Workshop | XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference | Joao Monteiro · Etienne Marcotte · Pierre-Andre Noel · Valentina Zantedeschi · David Vazquez · Nicolas Chapados · Christopher Pal · Perouz Taslakian
Workshop | Sirius: Contextual Sparsity with Correction for Efficient LLM | Yang Zhou · Zhuoming Chen · Zhaozhuo Xu · Victoria Lin · Beidi Chen
Workshop | Sat 10:30 | Internalizing ASR with Implicit Chain of Thought for Efficient Speech-to-Speech Conversational LLM | Robin Shing-Hei Yuen · Timothy Tse · Jian Zhu
Workshop | Towards Low-bit Communication for Tensor Parallel LLM Inference | Harry Dong · Tyler Johnson · Minsik Cho · Emad Soroush