Workshop on Human and Machine Decisions

Will We Trust What We Don’t Understand? Impact of Model Interpretability and Outcome Feedback on Trust in AI

Daehwan Ahn · Abdullah Almaatouq · Monisha Gulabani · Kartik Hosanagar


Despite AI’s superhuman performance in a variety of domains, humans are often unwilling to adopt algorithms. The lack of interpretability inherent in many modern-day AI techniques is believed to be hurting algorithm adoption, as users may not trust systems whose decision processes they don’t understand. We investigate this proposition with a novel experiment in which we use an interactive prediction task to analyze the impact of interpretability as well as outcome feedback on trust in AI and performance in the prediction task. We find that interpretability led to no robust improvements in trust, while outcome feedback had a significantly greater and more reliable effect. However, neither factor had more than minimal effects on performance in the task. Our findings suggest that (1) factors receiving significant attention, such as interpretability, may be less effective at increasing trust than factors like outcome feedback, and (2) augmenting human performance via AI systems may not be a simple matter of increasing trust in AI, as increased trust is not always associated with equally sizable improvements in performance. These findings clarify for companies and product designers that providing interpretations alone may not be sufficient to solve challenges around user trust in AI products, while also highlighting that certain other features may be more effective. These findings also invite the research community to not only focus on methods for generating interpretations but also on methods for ensuring that interpretations impact trust and performance in practice, such as how to present interpretations to users.