Structural analysis methods (e.g., probing and feature attribution) are increasingly important tools for neural network analysis. We propose a new structural analysis method grounded in a formal theory of causal abstraction that provides rich characterizations of model-internal representations and their roles in input/output behavior. In this method, neural representations are aligned with variables in interpretable causal models, and then interchange interventions are used to experimentally verify that the neural representations have the causal properties of their aligned variables. We apply this method in a case study to analyze neural models trained on the Multiply Quantified Natural Language Inference (MQNLI) corpus, a highly complex NLI dataset that was constructed with a tree-structured natural logic causal model. We discover that a BERT-based model with state-of-the-art performance successfully realizes parts of the natural logic model’s causal structure, whereas a simpler baseline model fails to show any such structure, demonstrating that neural representations encode the compositional structure of MQNLI examples.
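To make the core operation concrete, below is a minimal, hypothetical PyTorch sketch of an interchange intervention: an activation produced on a source input is swapped into the forward pass on a base input at the location aligned with a causal-model variable. The ToyModel, the choice of layer, and the interchange_intervention helper are illustrative assumptions for this sketch, not the paper's actual models or code.

```python
import torch
import torch.nn as nn

# Toy stand-in for a trained network; the paper's experiments use BERT-based
# and simpler baseline models, which are not reproduced here.
class ToyModel(nn.Module):
    def __init__(self, dim=16, n_classes=3):
        super().__init__()
        self.layer1 = nn.Linear(dim, dim)
        self.layer2 = nn.Linear(dim, dim)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, x):
        h1 = torch.relu(self.layer1(x))
        h2 = torch.relu(self.layer2(h1))
        return self.head(h2)

def interchange_intervention(model, layer, base_input, source_input):
    """Run `model` on `base_input`, but overwrite the activation at `layer`
    (the neural representation aligned with a causal variable) with the
    activation that `layer` produces on `source_input`."""
    cache = {}

    # 1. Record the aligned representation's value on the source input.
    def record(module, inputs, output):
        cache["act"] = output.detach()

    handle = layer.register_forward_hook(record)
    with torch.no_grad():
        model(source_input)
    handle.remove()

    # 2. Re-run on the base input, patching in the recorded activation.
    def patch(module, inputs, output):
        return cache["act"]

    handle = layer.register_forward_hook(patch)
    with torch.no_grad():
        out = model(base_input)
    handle.remove()
    return out

model = ToyModel()
base, source = torch.randn(1, 16), torch.randn(1, 16)
counterfactual_logits = interchange_intervention(model, model.layer1, base, source)
# The neural representation is taken to realize its aligned causal variable when,
# across many (base, source) pairs, the intervened output matches the output the
# interpretable causal model predicts under the corresponding intervention.
print(counterfactual_logits.argmax(-1))
```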
Author Information
Atticus Geiger (Stanford University)
Hanson Lu (Stanford University)
Thomas Icard (Stanford University)
Christopher Potts (Stanford University)
More from the Same Authors
- 2021: ReaSCAN: Compositional Reasoning in Language Grounding (Zhengxuan Wu · Elisa Kreiss · Desmond Ong · Christopher Potts)
- 2021 Spotlight: Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval (Omar Khattab · Christopher Potts · Matei Zaharia)
- 2023 Poster: Interpretability at Scale: Identifying Causal Mechanisms in Alpaca (Zhengxuan Wu · Atticus Geiger · Christopher Potts · Noah Goodman)
- 2023 Poster: Comparing Causal Frameworks: Potential Outcomes, Structural Models, Graphs, and Abstractions (Duligur Ibeling · Thomas Icard)
- 2022 Poster: CEBaB: Estimating the Causal Effects of Real-World Concepts on NLP Model Behavior (Eldar D Abraham · Karel D'Oosterlinck · Amir Feder · Yair Gat · Atticus Geiger · Christopher Potts · Roi Reichart · Zhengxuan Wu)
- 2021: Intuitive Image Descriptions are Context-Sensitive (Shayan Hooshmand · Elisa Kreiss · Christopher Potts)
- 2021: Thomas Icard - A (topo)logical perspective on causal inference (Thomas Icard)
- 2021 Poster: Decrypting Cryptic Crosswords: Semantically Complex Wordplay Puzzles as a Target for NLP (Josh Rozner · Christopher Potts · Kyle Mahowald)
- 2021 Poster: A Topological Perspective on Causal Inference (Duligur Ibeling · Thomas Icard)
- 2021 Poster: Dynaboard: An Evaluation-As-A-Service Platform for Holistic Next-Generation Benchmarking (Zhiyi Ma · Kawin Ethayarajh · Tristan Thrush · Somya Jain · Ledell Wu · Robin Jia · Christopher Potts · Adina Williams · Douwe Kiela)
- 2021 Poster: Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval (Omar Khattab · Christopher Potts · Matei Zaharia)