

Search All 2024 Events
 

4 Results

Workshop
Sat 13:00 Panel: On Linear Representations and Pretraining Data Frequency in Language Models · When Attention Sink Emerges in Language Models: An Empirical View · Common Functional Decompositions Can Mis-attribute Differences in Outcomes Between Populations · U-shape
Workshop
When Attention Sink Emerges in Language Models: An Empirical View
Xiangming Gu · Tianyu Pang · Chao Du · Qian Liu · Fengzhuo Zhang · Cunxiao Du · Ye Wang · Min Lin
Workshop
Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs
Tianyu Guo · Druv Pai · Yu Bai · Jiantao Jiao · Michael Jordan · Song Mei
Workshop
Sat 15:30 Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs
Tianyu Guo · Druv Pai · Yu Bai · Jiantao Jiao · Michael Jordan · Song Mei