Timezone: »
Interpreting Deep Reinforcement Learning (DRL) models is important to enhance trust and comply with transparency regulations. Existing methods typically explain a DRL model by visualizing the importance of low-level input features with super-pixels, attentions, or saliency maps. Our approach provides an interpretation based on high-level latent object features derived from a disentangled representation. We propose a Represent And Mimic (RAMi) framework for training 1) an identifiable latent representation to capture the independent factors of variation for the objects and 2) a mimic tree that extracts the causal impact of the latent features on DRL action values. To jointly optimize both the fidelity and the simplicity of a mimic tree, we derive a novel Minimum Description Length (MDL) objective based on the Information Bottleneck (IB) principle. Based on this objective, we describe a Monte Carlo Regression Tree Search (MCRTS) algorithm that explores different splits to find the IB-optimal mimic tree. Experiments show that our mimic tree achieves strong approximation performance with significantly fewer nodes than baseline models. We demonstrate the interpretability of our mimic tree by showing latent traversals, decision rules, causal impacts, and human evaluation results.
Author Information
Guiliang Liu (University of Waterloo)
Xiangyu Sun (Simon Fraser University)
Oliver Schulte (Simon Fraser University)

Bio: Oliver Schulte is a Professor in the School of Computing Science at Simon Fraser University, Vancouver, Canada. He received his Ph.D. from Carnegie Mellon University in 1997. His current research focuses on machine learning for structured, relational, and event data. He has published sports analytics papers in leading AI and machine learning venues, and co-organized two hockey analytics conferences. The last two years he has worked with Sportlogiq, a leading hockey data provider. While he has won some nice awards, his biggest claim to fame may be a draw against chess world champion Gary Kasparov.
Pascal Poupart (University of Waterloo & Vector Institute)
More from the Same Authors
-
2022 Poster: Optimality and Stability in Non-Convex Smooth Games »
Guojun Zhang · Pascal Poupart · Yaoliang Yu -
2022 : Attribute Controlled Dialogue Prompting »
Runcheng Liu · Ahmad Rashid · Ivan Kobyzev · Mehdi Rezaghoizadeh · Pascal Poupart -
2022 : Geometric attacks on batch normalization »
Amur Ghose · Apurv Gupta · Yaoliang Yu · Pascal Poupart -
2022 Spotlight: Optimality and Stability in Non-Convex Smooth Games »
Guojun Zhang · Pascal Poupart · Yaoliang Yu -
2022 Spotlight: Micro and Macro Level Graph Modeling for Graph Variational Auto-Encoders »
Kiarash Zahirnia · Oliver Schulte · Parmis Naddaf · Ke Li -
2022 : Attribute Controlled Dialogue Prompting »
Runcheng Liu · Ahmad Rashid · Ivan Kobyzev · Mehdi Rezaghoizadeh · Pascal Poupart -
2022 Workshop: Second Workshop on Efficient Natural Language and Speech Processing (ENLSP-II) »
Mehdi Rezagholizadeh · Peyman Passban · Yue Dong · Lili Mou · Pascal Poupart · Ali Ghodsi · Qun Liu -
2022 Poster: Uncertainty-Aware Reinforcement Learning for Risk-Sensitive Player Evaluation in Sports Game »
Guiliang Liu · Yudong Luo · Oliver Schulte · Pascal Poupart -
2022 Poster: Micro and Macro Level Graph Modeling for Graph Variational Auto-Encoders »
Kiarash Zahirnia · Oliver Schulte · Parmis Naddaf · Ke Li -
2021 : Best Papers and Closing Remarks »
Ali Ghodsi · Pascal Poupart -
2021 : Panel Discussion »
Pascal Poupart · Ali Ghodsi · Luke Zettlemoyer · Sameer Singh · Kevin Duh · Yejin Choi · Lu Hou -
2021 Workshop: Efficient Natural Language and Speech Processing (Models, Training, and Inference) »
Mehdi Rezaghoizadeh · Lili Mou · Yue Dong · Pascal Poupart · Ali Ghodsi · Qun Liu -
2021 : Opening Speech »
Pascal Poupart -
2021 Poster: Quantifying and Improving Transferability in Domain Generalization »
Guojun Zhang · Han Zhao · Yaoliang Yu · Pascal Poupart -
2020 Poster: Learning Agent Representations for Ice Hockey »
Guiliang Liu · Oliver Schulte · Pascal Poupart · Mike Rudd · Mehrsan Javan -
2020 Poster: Learning Dynamic Belief Graphs to Generalize on Text-Based Games »
Ashutosh Adhikari · Xingdi Yuan · Marc-Alexandre Côté · Mikuláš Zelinka · Marc-Antoine Rondeau · Romain Laroche · Pascal Poupart · Jian Tang · Adam Trischler · Will Hamilton -
2018 Workshop: Reinforcement Learning under Partial Observability »
Joni Pajarinen · Chris Amato · Pascal Poupart · David Hsu -
2018 Poster: Deep Homogeneous Mixture Models: Representation, Separation, and Approximation »
Priyank Jaini · Pascal Poupart · Yaoliang Yu -
2018 Poster: Online Structure Learning for Feed-Forward and Recurrent Sum-Product Networks »
Agastya Kalra · Abdullah Rashwan · Wei-Shou Hsu · Pascal Poupart · Prashant Doshi · George Trimponias -
2018 Poster: Unsupervised Video Object Segmentation for Deep Reinforcement Learning »
Vikash Goel · Jameson Weng · Pascal Poupart -
2018 Poster: Monte-Carlo Tree Search for Constrained POMDPs »
Jongmin Lee · Geon-Hyeong Kim · Pascal Poupart · Kee-Eung Kim -
2016 Poster: Online Bayesian Moment Matching for Topic Modeling with Unknown Number of Topics »
Wei-Shou Hsu · Pascal Poupart -
2016 Poster: A Unified Approach for Learning the Parameters of Sum-Product Networks »
Han Zhao · Pascal Poupart · Geoffrey Gordon