Poster in Workshop: Safe and Robust Control of Uncertain Systems

Avoiding Negative Side Effects by Considering Others

Parand Alizadeh Alamdari · Toryn Klassen · Rodrigo Toro Icarte · Sheila McIlraith


Abstract:

Recent work in AI safety has highlighted that in sequential decision making, objectives are often underspecified or incomplete, which can allow an AI agent to make undesirable changes to the world while still achieving its given objective. Several recent papers have proposed avoiding such negative side effects by giving the agent an auxiliary reward for preserving its own ability to complete tasks or gain reward. We argue that the agent's effects on others must also be explicitly considered, and we provide a formulation that generalizes this prior work. We experimentally investigate our approach with RL agents in gridworlds.
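To make the auxiliary-reward idea concrete, here is a minimal sketch in Python, under stated assumptions rather than the authors' actual formulation: the environment reward is augmented with a weighted estimate of the value each other agent could still attain from the resulting state. The names `estimate_attainable_value`, `value_tables`, and `caring_weights` are illustrative assumptions, not identifiers from the paper.

```python
# A minimal sketch (assumed formulation, not the paper's): augment the
# task reward with an estimate of how well *other* agents could still
# achieve their goals from the state the action leads to.

def estimate_attainable_value(state, agent_id, value_tables):
    """Look up a precomputed estimate (e.g. from value iteration on
    agent_id's own task) of the value agent_id could attain from state."""
    return value_tables[agent_id].get(state, 0.0)

def augmented_reward(env_reward, next_state, others, value_tables,
                     caring_weights, alpha=0.1):
    """Task reward plus an auxiliary term that rewards preserving the
    other agents' ability to gain value, weighted per agent."""
    aux = sum(caring_weights[a] *
              estimate_attainable_value(next_state, a, value_tables)
              for a in others)
    return env_reward + alpha * aux

# Illustrative usage with a toy gridworld state (a coordinate tuple):
value_tables = {"human": {(0, 0): 1.0, (0, 1): 0.4}}
r = augmented_reward(env_reward=1.0, next_state=(0, 1),
                     others=["human"], value_tables=value_tables,
                     caring_weights={"human": 1.0})  # 1.0 + 0.1 * 0.4
```

Restricting `others` to the acting agent itself would recover the "preserve your own future abilities" auxiliary rewards the abstract attributes to prior work; including other agents extends the same mechanism in the direction of the generalization the abstract describes.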