

Poster in Workshop: Learning and Decision-Making with Strategic Feedback (StratML)

Reward-Free Attacks in Multi-Agent Reinforcement Learning

Ted Fujimoto · Tim Doster · Adam Attarian · Jill Brandenberger · Nathan Hodas


Abstract:

We investigate how effective an attacker can be when it learns only from its victim's actions, without access to the victim's reward. In this work, we are motivated by the scenario where the attacker wants to disrupt real-world RL applications, such as autonomous vehicles or delivery drones, without knowing the victim's precise goals or reward function. We argue that one heuristic approach an attacker can use is to strategically maximize the entropy of the victim's policy. The victim's policy is generally not obfuscated, which implies it can be extracted simply by passively observing the victim. We provide such a strategy in the form of a reward-free exploration algorithm that maximizes the attacker's entropy during the exploration phase and then maximizes the victim's empirical entropy during the planning phase. In our experiments, the victim agents are subverted through policy entropy maximization, implying that an attacker might not need access to the victim's reward to succeed. Hence, even if the victim's reward information is protected, reward-free attacks based only on observed behavior underscore the need to better understand policy obfuscation before deploying reinforcement learning in real-world applications.
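The planning-phase objective described in the abstract can be illustrated with a short sketch. The following is a minimal, hypothetical illustration (not the authors' implementation): it estimates the victim's policy from passively observed state-action pairs and uses the empirical entropy of that policy as the attacker's surrogate reward. The function names (estimate_victim_policy, entropy_reward) are assumptions introduced for illustration.

```python
# Minimal sketch of a reward-free attack objective: the attacker never sees
# the victim's reward, only its (state, action) behavior, and scores states
# by the empirical entropy of the victim's observed policy there.
from collections import Counter, defaultdict
import math


def estimate_victim_policy(observations):
    """Estimate per-state action distributions from observed (state, action) pairs."""
    counts = defaultdict(Counter)
    for state, action in observations:
        counts[state][action] += 1
    policy = {}
    for state, action_counts in counts.items():
        total = sum(action_counts.values())
        policy[state] = {a: c / total for a, c in action_counts.items()}
    return policy


def entropy_reward(policy, state):
    """Attacker's surrogate reward: empirical entropy of the victim's policy at `state`."""
    dist = policy.get(state)
    if not dist:
        return 0.0
    return -sum(p * math.log(p) for p in dist.values() if p > 0)


# Example: after passively watching the victim, the attacker prefers to steer
# the interaction toward states where the victim behaves inconsistently.
observations = [("s0", "left"), ("s0", "left"), ("s0", "right"), ("s1", "up")]
victim_policy = estimate_victim_policy(observations)
print(entropy_reward(victim_policy, "s0"))  # > 0: mixed behavior in s0
print(entropy_reward(victim_policy, "s1"))  # 0.0: deterministic behavior in s1
```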
