Poster
|
Tue 14:00
|
Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
Antoine Yang · Antoine Miech · Josef Sivic · Ivan Laptev · Cordelia Schmid
|
|
Poster
|
Tue 14:00
|
Grounded Video Situation Recognition
Zeeshan Khan · C.V. Jawahar · Makarand Tapaswi
|
|
Workshop
|
Fri 6:00
|
Fine-grained Interactive Vision Language Pre-training
Lu Hou
|
|
Poster
|
Tue 14:00
|
Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts
Basil Mustafa · Carlos Riquelme · Joan Puigcerver · Rodolphe Jenatton · Neil Houlsby
|
|
Workshop
|
Sat 7:45
|
Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action
|
|
Poster
|
Tue 9:00
|
CoupAlign: Coupling Word-Pixel with Sentence-Mask Alignments for Referring Image Segmentation
Zicheng Zhang · Yi Zhu · Jianzhuang Liu · Xiaodan Liang · Wei Ke
|
|
Workshop
|
|
Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action
Dhruv Shah
|
|
Poster
|
Thu 9:00
|
Mutual Information Divergence: A Unified Metric for Multimodal Generative Models
Jin-Hwa Kim · Yunji Kim · Jiyoung Lee · Kang Min Yoo · Sang-Woo Lee
|
|