Consider the problem of estimating the causal effect of some attribute of a text document. For example, what effect does writing a polite rather than a rude email have on response time? To estimate a causal effect from observational data, we need to control for confounding by adjusting for a set of covariates X that includes all common causes of the treatment and the outcome. For this adjustment to work, the data must satisfy overlap: the probability of treatment must be bounded away from 0 and 1 for all levels of X. In the text setting, we can try to adjust for all common causes by adjusting for the entire text. However, when the treatment is itself an attribute of the text, this violates overlap: the text determines the treatment, so the probability of treatment given the text is 0 or 1. The main goal of this paper is to develop an alternative approach that adjusts for a “part” of the text that is large enough to control for confounding but small enough to avoid overlap violations. We propose a procedure that identifies and discards the part of the text that is predictive of the treatment only. This information does not affect the outcome, so it is not needed to control for confounding and can be safely removed. Moreover, if the removed information is what made perfect treatment prediction possible, then removing it restores overlap. We adapt deep models and propose a learning strategy that learns multiple representations with different predictive properties. The procedure explicitly divides a (BERT) embedding of the text into one piece relevant to the outcome and one piece relevant to the treatment only. A regularization term is included to enforce this structure. Early empirical results indicate that our method detects an appropriate confounding variable and mitigates the overlap issue.
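The overlap argument can be illustrated with a toy simulation. The sketch below is not the proposed method, which operates on BERT embeddings with a regularizer. It only assumes a hypothetical setting in which each email has a binary `topic` (a common cause of politeness and response time) and a binary `polite` marker that is visible in the text and is itself the treatment. All names, probabilities, and effect sizes are assumptions made for illustration.

```python
import random
from collections import defaultdict


def simulate(n=20000, seed=0):
    """Draw hypothetical (topic, polite, response_time) rows."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        topic = int(rng.random() < 0.5)  # confounder
        polite = int(rng.random() < (0.7 if topic else 0.3))  # treatment
        # Assumed outcome model: politeness lowers response time by 2.0.
        response_time = 10.0 - 2.0 * polite + 3.0 * topic + rng.gauss(0.0, 1.0)
        rows.append((topic, polite, response_time))
    return rows


def propensity(rows, key):
    """Empirical P(polite = 1 | key(topic, polite)) for each observed cell."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [total, treated]
    for topic, polite, _ in rows:
        cell = counts[key(topic, polite)]
        cell[0] += 1
        cell[1] += polite
    return {k: treated / total for k, (total, treated) in counts.items()}


def stratified_effect(rows):
    """Adjust for topic only: average within-topic treated-minus-control gaps."""
    sums = defaultdict(float)
    sizes = defaultdict(int)
    for topic, polite, y in rows:
        sums[(topic, polite)] += y
        sizes[(topic, polite)] += 1
    effect = 0.0
    for topic in (0, 1):
        weight = (sizes[(topic, 0)] + sizes[(topic, 1)]) / len(rows)
        mean_treated = sums[(topic, 1)] / sizes[(topic, 1)]
        mean_control = sums[(topic, 0)] / sizes[(topic, 0)]
        effect += weight * (mean_treated - mean_control)
    return effect


rows = simulate()
# "Full text": the adjustment set still contains the treatment marker.
full_text = propensity(rows, key=lambda topic, polite: (topic, polite))
# "Reduced text": the treatment-only information has been discarded.
reduced = propensity(rows, key=lambda topic, polite: topic)

print("P(T=1 | full text):", full_text)
print("P(T=1 | reduced):  ", reduced)
print("adjusted effect:   ", stratified_effect(rows))
```

When the adjustment set keeps the treatment marker, every cell has an empirical propensity of exactly 0 or 1, so no cell contains both treated and control units to compare. After the marker is dropped, the propensities fall strictly between 0 and 1, and stratifying on `topic` should give an estimate close to the simulated effect of -2.0.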