This paper investigates the resilience of reinforcement learning (RL) policies to training-environment poisoning attacks, with the goal of recovering the deployment performance of a poisoned RL policy. Because policy resilience is an add-on concern for RL algorithms, it must be resource-efficient, time-efficient, and widely applicable, without compromising the performance of the underlying RL algorithm. This paper proposes such a policy-resilience mechanism based on the idea of sharing environment knowledge. We organize policy resilience into three stages: preparation, diagnosis, and recovery. Specifically, we design the mechanism as a federated architecture coupled with a meta-learning approach, enabling efficient extraction and sharing of environment knowledge. With the shared knowledge, a poisoned agent can quickly identify its deployment condition and recover its policy performance accordingly. We empirically evaluate the resilience mechanism on both model-based and model-free RL algorithms, showing its effectiveness and efficiency in restoring the deployment performance of a poisoned policy.
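The abstract describes the three-stage loop only at a high level; below is a minimal, illustrative sketch of how such a loop could look, assuming toy linear dynamics models and a Reptile-style meta-update as the federated knowledge-sharing step. All function names (`fit_local`, `meta_train`, `diagnose`, `recover_action`) and algorithmic details here are assumptions for illustration, not the paper's actual method.

```python
import numpy as np

ds, da = 2, 1  # state and action dimensions (toy sizes)

def model_predict(theta, s, a):
    """Linear one-step dynamics model: s' = [s, a] @ theta."""
    return np.concatenate([s, a]) @ theta

def fit_local(theta, transitions, lr=1e-2, steps=50):
    """Gradient descent on squared one-step prediction error."""
    theta = theta.copy()
    for _ in range(steps):
        for s, a, s_next in transitions:
            x = np.concatenate([s, a])
            theta -= lr * np.outer(x, x @ theta - s_next)
    return theta

# Stage 1 -- preparation: clean agents jointly meta-learn environment knowledge.
def meta_train(client_datasets, theta, meta_lr=0.5, rounds=20):
    for _ in range(rounds):
        adapted = [fit_local(theta, d) for d in client_datasets]
        # Reptile-style meta-update: move toward the mean of the adapted models.
        theta = theta + meta_lr * (np.mean(adapted, axis=0) - theta)
    return theta

# Stage 2 -- diagnosis: few-shot adaptation to the true deployment dynamics.
def diagnose(theta_meta, deployment_transitions):
    return fit_local(theta_meta, deployment_transitions, steps=20)

# Stage 3 -- recovery: act greedily against the identified model.
def recover_action(theta, s, candidate_actions, reward_fn):
    returns = [reward_fn(model_predict(theta, s, a)) for a in candidate_actions]
    return candidate_actions[int(np.argmax(returns))]

def make_transitions(A, n=30, seed=0):
    """Sample (s, a, s') tuples from a ground-truth linear environment A."""
    r = np.random.default_rng(seed)
    out = []
    for _ in range(n):
        s, a = r.normal(size=ds), r.normal(size=da)
        out.append((s, a, np.concatenate([s, a]) @ A))
    return out

# Usage sketch: meta-train across clean clients, then diagnose and recover
# from only five transitions observed in the (unknown) deployment environment.
rng = np.random.default_rng(1)
envs = [rng.normal(size=(ds + da, ds)) for _ in range(3)]   # clean clients
theta_meta = meta_train([make_transitions(A, seed=i) for i, A in enumerate(envs)],
                        np.zeros((ds + da, ds)))
deploy_A = rng.normal(size=(ds + da, ds))                   # deployment dynamics
theta_hat = diagnose(theta_meta, make_transitions(deploy_A, n=5, seed=7))
act = recover_action(theta_hat, np.ones(ds),
                     [np.array([u]) for u in (-1.0, 0.0, 1.0)],
                     reward_fn=lambda s_next: -np.sum(s_next ** 2))
```

In this sketch the federated step reduces to parameter averaging; consistent with the federated setting the abstract describes, agents would share only model parameters, never raw environment transitions.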
Author Information
Hang Xu (Nanyang Technological University)
Zinovi Rabinovich (Nanyang Technological University)