
Infer to Control: Probabilistic Reinforcement Learning and Structured Control
Leslie Kaelbling · Martin Riedmiller · Marc Toussaint · Igor Mordatch · Roy Fox · Tuomas Haarnoja

Sat Dec 08 05:00 AM -- 03:30 PM (PST) @ Room 516 CDE
Event URL: https://sites.google.com/view/infer2control-nips2018

Reinforcement learning and imitation learning are effective paradigms for learning controllers of dynamical systems from experience. These fields have been empowered by recent success in deep learning of differentiable parametric models, allowing end-to-end training of highly nonlinear controllers that encompass perception, memory, prediction, and decision making. The aptitude of these models to represent latent dynamics, high-level goals, and long-term outcomes is unfortunately curbed by the poor sample complexity of many current algorithms for learning these models from experience.

Probabilistic reinforcement learning and inference of control structure are emerging as promising approaches for avoiding prohibitive amounts of controller–system interactions. These methods leverage informative priors on useful behavior, as well as controller structure such as hierarchy and modularity, as useful inductive biases that reduce the effective size of policy search space and shape the optimization landscape. Intrinsic and self-supervised signals can further guide the training process of distinct internal components — such as perceptual embeddings, predictive models, exploration policies, and inter-agent communication — to break down the hard holistic problem of control into more efficiently learnable parts.

Effective inference methods are crucial for probabilistic approaches to reinforcement learning and structured control. Approximate control and model-free reinforcement learning exploit latent system structure and priors on policy structure, which are not directly evident in the controller–system interactions and must be inferred by the learning algorithm. The growing interest of the reinforcement learning and optimal control community in applying inference methods coincides with the probabilistic learning community's development of powerful inference techniques, such as probabilistic programming, variational inference, Gaussian processes, and nonparametric regression.

This workshop is a venue for the inference and reinforcement learning communities to come together to discuss recent advances, develop insights, and explore the future potential of inference methods and their application to probabilistic reinforcement learning and structured control. The goal of this workshop is to catalyze tighter collaboration within and between the communities, which will be leveraged in upcoming years to rise to the challenges of real-world control problems.

=== Intel AI is proud to sponsor Infer2Control @ NeurIPS 2018 ===
Early detection of tumors. Predicting equipment failures before they happen. Having a natural conversation with your home or car. Making retail more personal than ever. This is Artificial Intelligence powered by Intel, and companies around the globe are using it to make money, save money, and advance the future of their industry. At Intel, we’re using decades of expertise in silicon, software, communications, memory and storage to create the new technologies that AI demands. Technologies that break barriers between data center and edge, server and network, training and inference, model and reality – maximizing the economics of AI to take data from theory to real-world success. Learn more: ai.intel.com

Sat 5:20 a.m. - 5:30 a.m.
Opening Remarks (Introduction)
Roy Fox
Sat 5:30 a.m. - 6:00 a.m.
Control as Inference and Soft Deep RL (Sergey Levine) (Invited Talk)
Sergey Levine
Sat 6:00 a.m. - 6:10 a.m.
Unsupervised Learning of Image Embedding for Continuous Control (Carlos Florensa) (Contributed Talk)
Carlos Florensa
Sat 6:10 a.m. - 6:20 a.m.
Variational Inference Techniques for Sequential Decision Making in Generative Models (Igor Kiselev) (Contributed Talk)
Igor Kiselev
Sat 6:20 a.m. - 6:30 a.m.
Probabilistic Planning with Sequential Monte Carlo (Alexandre Piché) (Contributed Talk)
Alexandre Piche
Sat 6:30 a.m. - 7:00 a.m.
Inference and control of rules in human hierarchical reinforcement learning (Anne Collins) (Invited Talk)
Anne Collins
Sat 7:00 a.m. - 7:30 a.m.
Hierarchical RL: From Prior Knowledge to Policies (Shie Mannor) (Invited Talk)
Shie Mannor
Sat 7:30 a.m. - 8:00 a.m.
-- Coffee Break 1 -- (Break)
Sat 8:00 a.m. - 8:30 a.m.
Off-policy Policy Optimization (Dale Schuurmans) (Invited Talk)
Dale Schuurmans
Sat 8:30 a.m. - 8:45 a.m.
Spotlights 1 (Spotlights)
Ming-Xu Huang, Hao(Jackson) Cui, Arash Mehrjou, Yaqi Duan, Sharad Vikram, Angelina Wang, Karan Goel, Jonathan Hunt, Zhengwei Wu, Dinghan Shen, Matthew Fellows
Sat 8:45 a.m. - 9:15 a.m.
Poster Session 1 (Poster Session)
Kyle H Ambert, Brandon Araki, Xiya Cao, Sungjoon Choi, Hao(Jackson) Cui, Jonas Degrave, Yaqi Duan, Matthew Fellows, Carlos Florensa, Karan Goel, Aditya Gopalan, Ming-Xu Huang, Jonathan Hunt, Cyril Ibrahim, Brian Ichter, Max Igl, Tracy Ke Ke, Igor Kiselev, Anuj Mahajan, Arash Mehrjou, Karl Pertsch, Alexandre Piche, Nick Rhinehart, Thomas Ringstrom, Reaz Russel, Oleh Rybkin, Ion Stoica, Sharad Vikram, Angelina Wang, Ting-Han Wei, Abigail H Wen, I-Chen Wu, Zhengwei Wu, Linhai Xie, Dinghan Shen
Sat 9:15 a.m. - 10:45 a.m.
-- Lunch Break -- (Break)
Sat 10:45 a.m. - 11:15 a.m.
Solving inference and control problems with the same machinery (Emo Todorov) (Invited Talk)
Emo Todorov
Sat 11:15 a.m. - 11:30 a.m.
Spotlights 2 (Spotlights)
Aditya Gopalan, Sungjoon Choi, Thomas Ringstrom, Roy Fox, Jonas Degrave, Xiya Cao, Karl Pertsch, Max Igl, Brian Ichter
Sat 11:30 a.m. - 12:00 p.m.
Inference and Control of Learning Behavior in Rodents (Ryan Adams) (Invited Talk)
Ryan Adams
Sat 12:00 p.m. - 12:30 p.m.
-- Coffee Break 2 -- (Break)
Sat 12:30 p.m. - 1:00 p.m.
On the Value of Knowing What You Don't Know: Learning to Sample and Sampling to Learn for Robot Planning (Leslie Kaelbling) (Invited Talk)
Leslie Kaelbling
Sat 1:00 p.m. - 1:10 p.m.
Learning to Plan with Logical Automata (Brandon Araki) (Contributed Talk)
Brandon Araki
Sat 1:10 p.m. - 1:20 p.m.
Tight Bayesian Ambiguity Sets for Robust MDPs (Reazul Hasan Russel) (Contributed Talk)
Reaz Russel
Sat 1:20 p.m. - 1:30 p.m.
Deep Imitative Models for Flexible Inference, Planning, and Control (Nicholas Rhinehart) (Contributed Talk)
Nick Rhinehart
Sat 1:30 p.m. - 2:00 p.m.
Probabilistic Reasoning for Reinforcement Learning (Nicolas Heess) (Invited Talk)
Nicolas Heess
Sat 2:00 p.m. - 3:00 p.m.
Discussion Panel: Ryan Adams, Nicolas Heess, Leslie Kaelbling, Shie Mannor, Emo Todorov (moderator: Roy Fox) (Discussion Panel)
Ryan Adams, Nicolas Heess, Leslie Kaelbling, Shie Mannor, Emo Todorov, Roy Fox
Sat 3:00 p.m. - 3:30 p.m.
Poster Session 2 (Poster Session)

Author Information

Leslie Kaelbling (MIT)
Martin Riedmiller (DeepMind)
Marc Toussaint (University of Stuttgart)
Igor Mordatch (University of Washington)
Roy Fox (UC Berkeley)

[Roy Fox](http://roydfox.com/) is a postdoc at UC Berkeley working with [Ion Stoica](http://people.eecs.berkeley.edu/~istoica/) in the Real-Time Intelligent Secure Explainable lab ([RISELab](https://rise.cs.berkeley.edu/)), and with [Ken Goldberg](http://goldberg.berkeley.edu/) in the Laboratory for Automation Science and Engineering ([AUTOLAB](http://autolab.berkeley.edu/)). His research interests include reinforcement learning, dynamical systems, information theory, automation, and the connections between these fields. His current research focuses on automatic discovery of hierarchical control structures in deep reinforcement learning and in imitation learning of robotic tasks. Roy holds an MSc in Computer Science from the [Technion](http://www.cs.technion.ac.il/), under the supervision of [Moshe Tennenholtz](http://iew3.technion.ac.il/Home/Users/Moshet.phtml), and a PhD in Computer Science from the [Hebrew University](http://www.cs.huji.ac.il/), under the supervision of [Naftali Tishby](http://www.cs.huji.ac.il/~tishby/). He was an exchange PhD student with [Larry Abbott](http://www.cs.huji.ac.il/~tishby/) and [Liam Paninski](http://www.stat.columbia.edu/~liam/) at the [Center for Theoretical Neuroscience](http://www.neurotheory.columbia.edu/) at Columbia University, and a research intern at Microsoft Research.

Tuomas Haarnoja (UC Berkeley)
