
Workshop: I Can’t Believe It’s Not Better (ICBINB): Failure Modes in the Age of Foundation Models

Towards Better Understanding of Domain Shift on Linear-Probed Visual Foundation Models

Eric Heim


Visual foundation models have emerged in recent years with a promise similar to that of their language counterparts: the ability to produce representations of visual data that can be used successfully across a variety of tasks and contexts. One common way this is demonstrated in the published literature is through “domain generalization” experiments on linear models trained on representations produced by foundation models (i.e., linear probes). These experiments largely limit themselves to a small number of benchmark data sets and report accuracy as the single figure of merit, giving little insight beyond these numbers into how different foundation models represent shifts. In this work, we perform an empirical evaluation that expands the scope of previously reported results in order to better understand how domain shifts are modeled. Namely, we investigate not just how models generalize across domains, but also how models produce features that may enable domain transfer. Our evaluation spans a number of recent visual foundation models and benchmarks, and we provide discussion that emphasizes the need for further investigation.
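For readers unfamiliar with the setup, the following is a minimal sketch of a linear-probe domain generalization experiment, not the evaluation code used in this work. It assumes synthetic Gaussian vectors as stand-ins for frozen foundation-model features, models a domain shift as a constant offset in feature space, and fits the probe by ridge-regularized least squares rather than the logistic regression more commonly used in practice. All function names, dimensions, and parameters here are hypothetical.

```python
import numpy as np


def fit_linear_probe(features, labels, num_classes, ridge=1e-3):
    """Fit a ridge-regularized least-squares linear classifier on frozen features."""
    # Append a constant column so the probe learns a bias term.
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    Y = np.eye(num_classes)[labels]  # one-hot targets
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ Y)


def probe_accuracy(W, features, labels):
    """Accuracy of the linear probe W on a set of features."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    preds = (X @ W).argmax(axis=1)
    return float((preds == labels).mean())


def make_domain(rng, n, class_means, shift):
    """Synthetic stand-in for features extracted from one domain (assumption)."""
    labels = rng.integers(0, len(class_means), size=n)
    noise = rng.normal(scale=1.0, size=(n, class_means.shape[1]))
    return class_means[labels] + shift + noise, labels


rng = np.random.default_rng(0)
class_means = rng.normal(scale=3.0, size=(3, 16))  # 3 classes, 16-dim features

# Source domain: no offset. Target domain: the same classes under a feature offset.
source_train = make_domain(rng, 600, class_means, shift=0.0)
source_test = make_domain(rng, 300, class_means, shift=0.0)
target_test = make_domain(rng, 300, class_means, shift=rng.normal(scale=2.0, size=16))

# Train the probe on the source domain only, then evaluate on both domains.
W = fit_linear_probe(*source_train, num_classes=3)
in_acc = probe_accuracy(W, *source_test)
out_acc = probe_accuracy(W, *target_test)
print(f"in-domain accuracy: {in_acc:.3f}, out-of-domain accuracy: {out_acc:.3f}")
```

The gap between the two accuracies is the single figure of merit that such experiments typically report. It says nothing about how the shift is laid out in feature space, which is the kind of question the evaluation described above is concerned with.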
