Mexico City Oral
Thu Dec 04 03:30 PM -- 03:50 PM (PST) @ Don Alberto 1
A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders
David Chanin · James Wilken-Smith · Tomáš Dulka · Hardik Bhatnagar · Satvik Golechha · Joseph Bloom
Mexico City Oral
Thu Dec 04 03:50 PM -- 04:10 PM (PST) @ Don Alberto 1
Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free
Zihan Qiu · Zekun Wang · Bo Zheng · Zeyu Huang · Kaiyue Wen · Songlin Yang · Rui Men · Le Yu · Fei Huang · Suozhi Huang · Dayiheng Liu · Jingren Zhou · Junyang Lin
Mexico City Oral
Thu Dec 04 04:10 PM -- 04:30 PM (PST) @ Don Alberto 1
Superposition Yields Robust Neural Scaling
Yizhou Liu · Ziming Liu · Jeff Gore