Poster Wed, Dec 3, 2025 • 11:00 AM – 2:00 PM PST

AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Document Understanding

Ahmed Masry ⋅ Juan Rodriguez ⋅ Tianyu Zhang ⋅ Suyuchen Wang ⋅ Chao Wang ⋅ Aarash Feizi ⋅ Akshay Kalkunte Suresh ⋅ Abhay Puri ⋅ Xiangru Jian ⋅ Pierre-André Noël ⋅ Sathwik Tejaswi Madhusudhan ⋅ Marco Pedersoli ⋅ Bang Liu ⋅ Nicolas Chapados ⋅ Yoshua Bengio ⋅ Enamul Hoque ⋅ Chris Pal ⋅ Issam Hadj Laradji ⋅ David Vazquez ⋅ Perouz Taslakian ⋅ Spandana Gella ⋅ Sai Rajeswar Mudumba

Abstract
