Skip to yearly menu bar Skip to main content

Workshop: XAI in Action: Past, Present, and Future Applications

Utilizing Explainability Techniques for Reinforcement Learning Model Assurance

Alexander Tapley

[ ] [ Project Page ]
Sat 16 Dec 12:01 p.m. PST — 1 p.m. PST


Explainable Reinforcement Learning (XRL) can provide transparency into the decision-making process of a Reinforcement Learning (RL) model and increase user trust and adoption into real-world use cases. By utilizing XRL techniques, researchers can identify potential vulnerabilities within a trained RL model prior to deployment, therefore limiting the potential for mission failure or mistakes by the system. This paper introduces the ARLIN (Assured RL Model Interrogation) Toolkit, a Python library that provides explainability outputs for trained RL models that can be used to identify potential policy vulnerabilities and critical points. Using XRL datasets, ARLIN provides detailed analysis into an RL model's latent space, creates a semi-aggregated Markov decision process (SAMDP) to outline the model's path throughout an episode, and produces cluster analytics for each node within the SAMDP to identify potential failure points and vulnerabilities within the model. To illustrate ARLIN's effectiveness, we provide sample API usage and corresponding explainability visualizations and vulnerability point detection for a publicly available RL model. The open-source code repository is available for download at (GitHub link forthcoming).

Chat is not available.