SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding

Haoxiang Wang ⋅ Pavan Kumar Anasosalu Vasu ⋅ Fartash Faghri ⋅ Raviteja Vemulapalli ⋅ Mehrdad Farajtabar ⋅ Sachin Mehta ⋅ Mohammad Rastegari ⋅ Oncel Tuzel ⋅ Hadi Pouransari

Abstract
