
Workshop: Attributing Model Behavior at Scale (ATTRIB)

Data Attribution for Segmentation Models

Albert Tam · Joshua Vendrow · Aleksander Madry

Abstract: The quality of segmentation models is driven by their training datasets, which are labeled with detailed segmentation masks. How does the composition of such a training dataset contribute to the performance of the resulting segmentation model? In this work, we take a step towards such an understanding by applying the lens of data attribution. To this end, we first identify specific behaviors of these models to attribute, and then provide a method for computing such attributions efficiently. We validate the resulting attributions, and leverage them both to identify harmful labeling errors and to curate a 50% subset of the MS COCO training dataset that leads to a 2.79% ± 0.49% increase in mIoU over the full dataset.
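For readers unfamiliar with the metric reported above, the following is a minimal sketch of mean intersection-over-union (mIoU) for a single predicted mask against a ground-truth mask. It is an illustration only: the function name `mean_iou`, the nested-list mask format, and the choice to skip classes absent from both masks are assumptions, and the abstract does not specify the authors' exact evaluation protocol (for example, how scores are aggregated across images).

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in the prediction or the target.

    pred and target are 2D lists of integer class labels with equal shape
    (a hypothetical format chosen for this sketch).
    """
    pred_flat = [label for row in pred for label in row]
    target_flat = [label for row in target for label in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("pred and target must contain the same number of pixels")

    ious = []
    for c in range(num_classes):
        # Pixels labeled c in both masks.
        intersection = sum(
            1 for p, t in zip(pred_flat, target_flat) if p == c and t == c
        )
        # Pixels labeled c in at least one mask.
        union = sum(
            1 for p, t in zip(pred_flat, target_flat) if p == c or t == c
        )
        if union == 0:
            continue  # assumption: a class absent from both masks is skipped
        ious.append(intersection / union)

    if not ious:
        return float("nan")
    return sum(ious) / len(ious)


# Toy 2x2 example with two classes.
pred = [[0, 0], [1, 1]]
target = [[0, 1], [1, 1]]
score = mean_iou(pred, target, num_classes=2)
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so `score` is their average, 7/12.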
