How can we build autonomous robots that operate in unstructured and dynamic environments such as homes or hospitals? This problem has been investigated in several disciplines, including planning (motion planning, task planning, etc.) and reinforcement learning (RL). While both fields have witnessed tremendous progress, each has fundamental drawbacks: planning approaches require substantial manual engineering to map perception to a formal planning problem, while RL, which can operate directly on raw percepts, is data-hungry, cannot generalize to new tasks, and is ‘black box’ in nature.
Motivated by humans’ remarkable capability to imagine and plan complex manipulations of objects, and by recent advances in image generation such as GANs, we present Visual Plan Imagination (VPI), a new computational problem that combines image imagination and planning. In VPI, given off-policy image data from a dynamical system, the task is to ‘imagine’ image sequences that transition the system from start to goal. Thus, VPI focuses on the essence of planning with high-dimensional perception, and abstracts away low-level control and reward engineering. More importantly, VPI provides a safe and interpretable basis for robotic control: before the robot acts, a human can inspect the imagined plan the robot will act upon and intervene if necessary.
I will describe our approach to VPI based on Causal InfoGAN, a deep generative model that learns features that are compatible with strong planning algorithms. We show that Causal InfoGAN can generate convincing visual plans, and we demonstrate learning to imagine and execute real robot rope manipulation from image data. I will also discuss our VPI simulation benchmarks, and recent efforts in novelty detection, an important component in VPI, and in safe decision making in general.
Aviv Tamar (Technion)
More from the Same Authors
2015 Poster: Risk-Sensitive and Robust Decision-Making: a CVaR Optimization Approach
Yinlam Chow · Aviv Tamar · Shie Mannor · Marco Pavone
2015 Poster: Policy Gradient for Coherent Risk Measures
Aviv Tamar · Yinlam Chow · Mohammad Ghavamzadeh · Shie Mannor
2014 Workshop: From Bad Models to Good Policies (Sequential Decision Making under Uncertainty)
Odalric-Ambrym Maillard · Timothy A Mann · Shie Mannor · Jeremie Mary · Laurent Orseau · Thomas Dietterich · Ronald Ortner · Peter Grünwald · Joelle Pineau · Raphael Fonteneau · Georgios Theocharous · Esteban D Arcaute · Christos Dimitrakakis · Nan Jiang · Doina Precup · Pierre-Luc Bacon · Marek Petrik · Aviv Tamar