Timezone: »

Learning to Learn from Imperfect Demonstrations
Ge Yang · Chelsea Finn

Sat Dec 08 07:15 AM -- 07:30 AM (PST) @ None

In the standard formulation of imitation learning, the agent starts from scratch without the means to take advantage of an informative prior. As a result, the expert's demonstrations have to either be optimal, or contain a known mode of sub-optimality that could be modeled. In this work, we consider instead the problem of imitation learning from imperfect demonstrations where a small number of demonstrations containing unstructured imperfections is available. In particular, these demonstrations contain large systematic biases, or fails to complete the task in unspecified ways. Our Learning to Learn From Imperfect Demonstrations (LID) framework casts such problem as a meta-learning problem, where the agent meta-learns a robust imitation algorithm that is able to infer the correct policy despite of these imperfections, by taking advantage of an informative prior. We demonstrate the robustness of this algorithm over 2D reaching tasks, multitask door opening and picking tasks with a simulated robot arm, where the demonstration merely gestures for the intended target. Despite not seeing a demonstration that completes the task, the agent is able to draw lessons from its prior experience--correctly inferring a policy that accomplishes the task where the demonstration fails to.

Author Information

Ge Yang (Berkeley)
Chelsea Finn (UC Berkeley)

More from the Same Authors