

Poster

Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM

Zinuo Li ⋅ Xian Zhang ⋅ Yongxin Guo ⋅ Mohammed Bennamoun ⋅ Farid Boussaid ⋅ Girish Dwivedi ⋅ Luqi Gong ⋅ Qiuhong Ke
2025 Poster

Abstract

