Informally, a 'spurious correlation' is the dependence of a model on some aspect of the input data that an analyst thinks shouldn't matter. In machine learning, these have a know-it-when-you-see-it character; e.g., changing the gender of a sentence's subject changes a sentiment predictor's output. To check for spurious correlations, we can 'stress test' models by perturbing irrelevant parts of input data and seeing if model predictions change. In this paper, we study stress testing using the tools of causal inference. We introduce counterfactual invariance as a formalization of the requirement that changing irrelevant parts of the input shouldn't change model predictions. We connect counterfactual invariance to out-of-domain model performance, and provide practical schemes for learning (approximately) counterfactually invariant predictors (without access to counterfactual examples). It turns out that both the means of achieving counterfactual invariance and its implications depend fundamentally on the true underlying causal structure of the data; in particular, on whether the label causes the features or the features cause the label. Distinct causal structures require distinct regularization schemes to induce counterfactual invariance. Similarly, counterfactual invariance implies different domain shift guarantees depending on the underlying causal structure. This theory is supported by empirical results on text classification.
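To make the structure-dependent regularization schemes concrete: roughly, when the features cause the label, counterfactual invariance has the observable signature f(X) ⊥ Z (the predictor's output is marginally independent of the spurious factor Z), which can be encouraged with a maximum mean discrepancy (MMD) penalty between predictor outputs across levels of Z; when the label causes the features (anti-causal), the signature is the conditional independence f(X) ⊥ Z | Y, encouraged by penalizing the MMD within each label stratum. Below is a minimal PyTorch sketch of such penalties, assuming a binary spurious factor z; the helper names (rbf_kernel, mmd_penalty, cmmd_penalty) and the fixed kernel bandwidth are illustrative choices, not the paper's exact implementation.

```python
import torch

def rbf_kernel(a, b, sigma=1.0):
    # RBF (Gaussian) kernel matrix between the rows of a and b.
    return torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))

def mmd_penalty(h, z, sigma=1.0):
    # Squared MMD between predictor outputs h on the groups z == 0 and z == 1.
    # Driving this to zero pushes toward f(X) independent of Z
    # (the signature when the features cause the label).
    h0, h1 = h[z == 0], h[z == 1]
    return (rbf_kernel(h0, h0, sigma).mean()
            + rbf_kernel(h1, h1, sigma).mean()
            - 2.0 * rbf_kernel(h0, h1, sigma).mean())

def cmmd_penalty(h, z, y, sigma=1.0):
    # Conditional version: average the MMD penalty within each label stratum,
    # targeting f(X) independent of Z given Y (the anti-causal signature).
    strata = [mmd_penalty(h[y == c], z[y == c], sigma) for c in torch.unique(y)]
    return torch.stack(strata).mean()

# Hypothetical training step: the matched penalty is added to the ordinary
# task loss with a tuning weight lam, e.g.
#   logits = model(x)
#   loss = torch.nn.functional.cross_entropy(logits, y) + lam * cmmd_penalty(logits, z, y)
```

The essential design point is that the penalty must be matched to the causal structure: applying the marginal penalty to anti-causal data (or the conditional one to causal-direction data) targets the wrong independence and so does not enforce counterfactual invariance.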
Author Information
Victor Veitch (University of Chicago, Google)
Alexander D'Amour (Google Brain)
Steve Yadlowsky (Google Brain)
Jacob Eisenstein (Google)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Poster: Counterfactual Invariance to Spurious Correlations in Text Classification
  Tue. Dec 7th 04:30 -- 06:00 PM
More from the Same Authors
- 2021 Spotlight: SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression
  Steve Yadlowsky · Taedong Yun · Cory Y McLean · Alexander D'Amour
- 2021: Using Embeddings to Estimate Peer Influence on Social Networks
  Irina Cristali · Victor Veitch
- 2021: Mitigating Overlap Violations in Causal Inference with Text Data
  Lin Gui · Victor Veitch
- 2022: Honest Students from Untrusted Teachers: Learning an Interpretable Question-Answering Pipeline from a Pretrained Language Model
  Jacob Eisenstein · Daniel Andor · Bernd Bohnet · Michael Collins · David Mimno
- 2022: Causal Estimation for Text Data with (Apparent) Overlap Violations
  Lin Gui · Victor Veitch
- 2023 Poster: Uncovering Meanings of Embeddings via Partial Orthogonality
  Yibo Jiang · Bryon Aragam · Victor Veitch
- 2023 Poster: Concept Algebra for Score-based Conditional Models
  Zihao Wang · Lin Gui · Jeffrey Negrea · Victor Veitch
- 2023 Poster: Causal Context Connects Counterfactual Fairness to Robust Prediction and Group Fairness
  Jacy Anthis · Victor Veitch
- 2023 Workshop: Workshop on Distribution Shifts: New Frontiers with Foundation Models
  Rebecca Roelofs · Fanny Yang · Hongseok Namkoong · Masashi Sugiyama · Jacob Eisenstein · Pang Wei Koh · Shiori Sagawa · Tatsunori Hashimoto · Yoonho Lee
- 2022 Workshop: Workshop on Distribution Shifts: Connecting Methods and Applications
  Chelsea Finn · Fanny Yang · Hongseok Namkoong · Masashi Sugiyama · Jacob Eisenstein · Jonas Peters · Rebecca Roelofs · Shiori Sagawa · Pang Wei Koh · Yoonho Lee
- 2022 Poster: Using Embeddings for Causal Estimation of Peer Influence in Social Networks
  Irina Cristali · Victor Veitch
- 2022 Poster: Invariant and Transportable Representations for Anti-Causal Domain Shifts
  Yibo Jiang · Victor Veitch
- 2021 Poster: SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression
  Steve Yadlowsky · Taedong Yun · Cory Y McLean · Alexander D'Amour
- 2020 Poster: Sense and Sensitivity Analysis: Simple Post-Hoc Analysis of Bias Due to Unobserved Confounding
  Victor Veitch · Anisha Zaveri
- 2020 Spotlight: Sense and Sensitivity Analysis: Simple Post-Hoc Analysis of Bias Due to Unobserved Confounding
  Victor Veitch · Anisha Zaveri
- 2019: Coffee break, posters, and 1-on-1 discussions
  Julius von Kügelgen · David Rohde · Candice Schumann · Grace Charles · Victor Veitch · Vira Semenova · Mert Demirer · Vasilis Syrgkanis · Suraj Nair · Aahlad Puli · Masatoshi Uehara · Aditya Gopalan · Yi Ding · Ignavier Ng · Khashayar Khosravi · Eli Sherman · Shuxi Zeng · Aleksander Wieczorek · Hao Liu · Kyra Gan · Jason Hartford · Miruna Oprescu · Alexander D'Amour · Jörn Boehnke · Yuta Saito · Théophile Griveau-Billion · Chirag Modi · Shyngys Karimov · Jeroen Berrevoets · Logan Graham · Imke Mayer · Dhanya Sridhar · Issa Dahabreh · Alan Mishler · Duncan Wadsworth · Khizar Qureshi · Rahul Ladhania · Gota Morishita · Paul Welle
- 2019 Poster: Using Embeddings to Correct for Unobserved Confounding in Networks
  Victor Veitch · Yixin Wang · David Blei
- 2019 Poster: Adapting Neural Networks for the Estimation of Treatment Effects
  Claudia Shi · David Blei · Victor Veitch
- 2017 Poster: Reducing Reparameterization Gradient Variance
  Andrew Miller · Nick Foti · Alexander D'Amour · Ryan Adams
- 2015: The general class of (sparse) random graphs arising from exchangeable point processes
  Victor Veitch