46 Results
Poster | Wed 15:00 | The emergence of clusters in self-attention dynamics | Borjan Geshkovski · Cyril Letrouit · Yury Polyanskiy · Philippe Rigollet
Poster | Thu 8:45 | Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer | Yuandong Tian · Yiping Wang · Beidi Chen · Simon Du
Poster | Tue 15:15 | Random-Access Infinite Context Length for Transformers | Amirkeivan Mohtashami · Martin Jaggi
Workshop | | Think before you speak: Training Language Models With Pause Tokens | Sachin Goyal · Ziwei Ji · Ankit Rawat · Aditya Menon · Sanjiv Kumar · Vaishnavh Nagarajan
Poster | Wed 8:45 | Transformers are uninterpretable with myopic methods: a case study with bounded Dyck grammars | Kaiyue Wen · Yuchen Li · Bingbin Liu · Andrej Risteski
Workshop | | Deep Multimodal Emotion Recognition using Modality Aware Attention Network for Unifying Representations in Neural Models | Sungpil Woo · MUHAMMAD ZUBAIR · Sunhwan Lim · Daeyoung Kim
Poster | Tue 15:15 | Coneheads: Hierarchy Aware Attention | Albert Tseng · Tao Yu · Toni Liu · Christopher De Sa
Poster | Wed 15:00 | A Hierarchical Spatial Transformer for Massive Point Samples in Continuous Space | Wenchong He · Zhe Jiang · Tingsong Xiao · Zelin Xu · Shigang Chen · Ronald Fick · MILES MEDINA · Christine Angelini
Poster | Tue 15:15 | Polyhedron Attention Module: Learning Adaptive-order Interactions | Tan Zhu · Fei Dou · Xinyu Wang · Jin Lu · Jinbo Bi
Poster | Wed 15:00 | Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection | Yu Bai · Fan Chen · Huan Wang · Caiming Xiong · Song Mei
Poster | Wed 8:45 | The Impact of Positional Encoding on Length Generalization in Transformers | Amirhossein Kazemnejad · Inkit Padhi · Karthikeyan Natesan Ramamurthy · Payel Das · Siva Reddy
Poster | Thu 8:45 | Unlimiformer: Long-Range Transformers with Unlimited Length Input | Amanda Bertsch · Uri Alon · Graham Neubig · Matthew Gormley