

Poster

VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset

Sihan Chen ⋅ Handong Li ⋅ Qunbo Wang ⋅ Zijia Zhao ⋅ Mingzhen Sun ⋅ Xinxin Zhu ⋅ Jing Liu
2023 Poster

