| Type | Time | Title | Authors |
|---|---|---|---|
| Workshop | Sat 10:30 | 3D Audio-Visual Segmentation | Artem Sokolov · Swapnil Bhosale · Xiatian Zhu |
| Poster | Wed 16:30 | Aligning Audio-Visual Joint Representations with an Agentic Workflow | Shentong Mo · Yibing Song |
| Poster | Thu 16:30 | Mixtures of Experts for Audio-Visual Learning | Ying Cheng · Yang Li · Junjie He · Rui Feng |
| Workshop | | Explainable Audio-Visual Representation Learning via Prototypical Contrastive Masked Autoencoder | Yi Li · Plamen P Angelov |
| Workshop | | Multimodal Integration in Audio-Visual Speech Recognition --- How Far Are We From Human-Level Robustness? | Marianne Schweitzer · Anna Montagnini · Abdellah Fourtassi · Thomas Schatz |
| Workshop | | Less is Enough: Adapting Pre-trained Vision Transformers for Audio-Visual Speaker Verification | Gnana Praveen Rajasekhar · MD JAHANGIR ALAM |
| Poster | Wed 11:00 | Continual Audio-Visual Sound Separation | Weiguo Pian · Yiyang Nan · Shijian Deng · Shentong Mo · Yunhui Guo · Yapeng Tian |
| Workshop | | CrossCheckGPT: Universal Hallucination Ranking for Multimodal Foundation Models | Guangzhi Sun · Potsawee Manakul · Adian Liusie · Kunat Pipatanakul · Chao Zhang · Phil Woodland · Mark Gales |
| Poster | Fri 11:00 | Look, Listen, and Answer: Overcoming Biases for Audio-Visual Question Answering | Jie Ma · Min Hu · Pinghui Wang · Wangchun Sun · Lingyun Song · Hongbin Pei · Jun Liu · Youtian Du |
| Poster | | SpeechForensics: Audio-Visual Speech Representation Learning for Face Forgery Detection | Yachao Liang · Min Yu · Gang Li · Jianguo Jiang · Boquan Li · Feng Yu · Ning Zhang · Xiang Meng · Weiqing Huang |
| Workshop | Sat 16:15 | AV-DiT: Efficient Audio-Visual Diffusion Transformer for Joint Audio and Video Generation | |
| Poster | Fri 16:30 | AV-Cloud: Spatial Audio Rendering Through Audio-Visual Cloud Splatting | Mingfei Chen · Eli Shlizerman |