
Workshop: Trustworthy and Socially Responsible Machine Learning

Interactive Rationale Extraction for Text Classification

Jiayi Dai · Mi-Young Kim · Randolph Goebel


Deep neural networks show superior performance in text classification tasks, but their poor interpretability and explainability can cause trust issues. For text classification problems, identifying textual sub-phrases, or "rationales", is one strategy for finding the most influential portions of text, which can then be presented as critical to the classification decision. Selective models for rationale extraction faithfully explain a neural classifier's predictions by training a rationale generator and a text classifier jointly: the generator identifies rationales and the classifier predicts a category based solely on those rationales. The selected rationales are then viewed as the explanations for the classifier's predictions. Humans achieve higher performance in problem solving by exchanging such explanations. To imitate this interactive process, we propose a simple interactive rationale extraction architecture that selects a pair of rationales and then makes predictions from two independently trained selective models. We show that this architecture outperforms both base models in predictive performance on text classification tasks, using the IMDB movie reviews and 20 Newsgroups datasets.
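The generator/classifier split and the two-model interaction described above can be illustrated with a minimal sketch. Everything named here is hypothetical: `SelectiveModel`, `interactive_predict`, and the keyword-based stand-ins replace the trained neural components, and the rule for combining the two models (keep the more confident prediction) is an assumption, since the abstract does not state how the pair of rationales leads to a final prediction.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Tokens = List[str]
Probs = Dict[str, float]


@dataclass
class SelectiveModel:
    """A generator/classifier pair; the classifier sees only the rationale."""
    generator: Callable[[Tokens], List[bool]]  # one keep/drop flag per token
    classifier: Callable[[Tokens], Probs]      # label -> probability

    def explain_and_predict(self, tokens: Tokens) -> Tuple[Tokens, Probs]:
        mask = self.generator(tokens)
        rationale = [t for t, keep in zip(tokens, mask) if keep]
        return rationale, self.classifier(rationale)


def interactive_predict(model_a: SelectiveModel, model_b: SelectiveModel,
                        tokens: Tokens) -> Tuple[str, Tokens]:
    """Assumed combination rule: keep the more confident of the two predictions."""
    candidates = [m.explain_and_predict(tokens) for m in (model_a, model_b)]
    rationale, probs = max(candidates, key=lambda rp: max(rp[1].values()))
    return max(probs, key=probs.get), rationale


# Toy stand-ins for trained components (illustration only).
def keyword_generator(vocab):
    return lambda tokens: [t in vocab for t in tokens]


def keyword_classifier(positive, negative):
    def classify(rationale: Tokens) -> Probs:
        pos = sum(t in positive for t in rationale)
        neg = sum(t in negative for t in rationale)
        p = (pos + 1) / (pos + neg + 2)  # smoothed share of positive cues
        return {"pos": p, "neg": 1 - p}
    return classify


model_a = SelectiveModel(keyword_generator({"dull", "superb"}),
                         keyword_classifier({"superb"}, {"dull"}))
model_b = SelectiveModel(keyword_generator({"superb", "moving"}),
                         keyword_classifier({"superb", "moving"}, {"dull"}))

tokens = "the plot was dull but the acting was superb and moving".split()
label, rationale = interactive_predict(model_a, model_b, tokens)
print(label, rationale)
```

In this toy run the two models select different rationales from the same review, and the returned rationale is the one attached to the prediction that was kept, so the explanation still corresponds to the text the chosen classifier actually saw.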
