Workshop
Deep Reinforcement Learning
Pieter Abbeel · Chelsea Finn · Joelle Pineau · David Silver · Satinder Singh · Coline Devin · Misha Laskin · Kimin Lee · Janarthanan Rajendran · Vivek Veeriah

Fri Dec 11 08:30 AM -- 07:00 PM (PST)
Event URL: https://sites.google.com/view/deep-rl-workshop-neurips2020/home

In recent years, the use of deep neural networks as function approximators has enabled researchers to extend reinforcement learning techniques to solve increasingly complex control tasks. The emerging field of deep reinforcement learning has led to remarkable empirical results in rich and varied domains like robotics, strategy games, and multiagent interactions. This workshop will bring together researchers working at the intersection of deep learning and reinforcement learning, and it will help interested researchers outside of the field gain a high-level view of the current state of the art and potential directions for future contributions.

Fri 8:30 a.m. - 9:00 a.m.
Invited talk: Pierre-Yves Oudeyer "Machines that invent their own problems: Towards open-ended learning of skills" (Talk)
Pierre-Yves Oudeyer
Fri 9:00 a.m. - 9:15 a.m.
Contributed Talk: Learning Functionally Decomposed Hierarchies for Continuous Control Tasks with Path Planning (Talk)   
Sammy Christen, Lukas Jendele, Emre Aksan, Otmar Hilliges
Fri 9:15 a.m. - 9:30 a.m.
Contributed Talk: Maximum Reward Formulation In Reinforcement Learning (Talk)   
Sai Krishna Gottipati, Yashaswi Pathak, Rohan Nuttall, Sahir, Ravi Chunduru, Ahmed Touati, Sriram Ganapathi, Matthew Taylor, Sarath Chandar
Fri 9:30 a.m. - 9:45 a.m.
Contributed Talk: Accelerating Reinforcement Learning with Learned Skill Priors (Talk)   
Karl Pertsch, Youngwoon Lee, Joseph Lim
Fri 9:45 a.m. - 10:00 a.m.
Contributed Talk: Asymmetric self-play for automatic goal discovery in robotic manipulation (Talk)   
OpenAI Robotics, Matthias Plappert, Raul Sampedro, Tao Xu, Ilge Akkaya, Vineet Kosaraju, Peter Welinder, Ruben D'Sa, Arthur Petron, Henrique Ponde, Alex Paino, Hyeonwoo Noh, Lilian Weng, Qiming Yuan, Casey Chu, Wojciech Zaremba
Fri 10:00 a.m. - 10:30 a.m.
Invited talk: Marc Bellemare "Autonomous navigation of stratospheric balloons using reinforcement learning" (Talk)
Marc Bellemare
Fri 10:30 a.m. - 11:00 a.m.
Break
Fri 11:00 a.m. - 11:30 a.m.
Invited talk: Peter Stone (Talk)

For autonomous robots to operate in the open, dynamically changing world, they will need to be able to learn a robust set of skills from relatively little experience. This talk introduces Grounded Simulation Learning as a way to bridge the so-called reality gap between simulators and the real world in order to enable transfer learning from simulation to a real robot. Grounded Simulation Learning has led to the fastest known stable walk on a widely used humanoid robot. Connections to theoretical advances in off-policy reinforcement learning will be highlighted.

Peter Stone
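
The grounding step at the heart of this approach can be sketched in a few lines: collect transitions on the real robot, then adjust the simulator's parameters so that simulated one-step predictions match the real data, before running policy improvement in the grounded simulator. The toy dynamics, parameter names, and random-search optimizer below are illustrative assumptions, not the speaker's implementation:

    import numpy as np

    def simulate_step(state, action, params):
        # Stand-in parametric dynamics; a real physics simulator goes here.
        return params[0] * state + params[1] * action

    def grounding_loss(params, real_transitions):
        # Mean squared one-step prediction error against real-robot data.
        return np.mean([(simulate_step(s, a, params) - s_next) ** 2
                        for s, a, s_next in real_transitions])

    # Synthetic "real robot" transitions with unknown dynamics (0.9, 0.5).
    real = [(s, a, 0.9 * s + 0.5 * a + 0.01 * np.random.randn())
            for s, a in np.random.randn(200, 2)]

    # Ground the simulator by crude random search (any optimizer would do).
    best, best_loss = None, np.inf
    for _ in range(500):
        cand = np.random.uniform(-1.5, 1.5, size=2)
        loss = grounding_loss(cand, real)
        if loss < best_loss:
            best, best_loss = cand, loss
    # `best` now parameterizes the grounded simulator used for policy learning.

Policy learning then proceeds in the simulator with the grounded parameters, and the grounding step can be repeated as more real-world data arrives.
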
Fri 11:30 a.m. - 11:45 a.m.
Contributed Talk: Mirror Descent Policy Optimization (Talk)   
Manan Tomar, Lior Shani, Yonathan Efroni, Mohammad Ghavamzadeh
Fri 11:45 a.m. - 12:00 p.m.
Contributed Talk: Planning from Pixels using Inverse Dynamics Models (Talk)   
Keiran Paster, Sheila McIlraith, Jimmy Ba
Fri 12:00 p.m. - 12:30 p.m.
Invited talk: Matt Botvinick "Alchemy: A Benchmark Task Distribution for Meta-Reinforcement Learning Research" (Talk)   
Matt Botvinick
Fri 12:30 p.m. - 1:30 p.m.
Poster session 1 (Poster session)
Fri 1:30 p.m. - 2:00 p.m.
Invited talk: Susan Murphy (Talk)

Digital healthcare is a growing area of importance in modern healthcare due to its potential to help individuals improve their behaviors so as to better manage chronic health challenges such as hypertension, mental health, and cancer. Digital apps and wearables observe the user's state via sensors or self-report, deliver treatment actions (reminders, motivational messages, suggestions, social outreach, ...), and observe rewards repeatedly on the user across time. This area is seeing increasing interest from RL researchers, with the goal of including in the digital app or wearable an RL algorithm that "personalizes" the treatments to the user. But after RL is run on a number of users, how do we know whether the RL algorithm actually personalized the sequential treatments to the user? In this talk we report on our first efforts to address this question after our RL algorithm was deployed on each of 111 individuals with hypertension.

Susan Murphy
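
The sense-act-update loop described in the abstract can be made concrete with a small sketch. Here a linear Thompson-sampling bandit plays the role of the personalizing algorithm; the feature dimensions, two-action treatment set, and synthetic reward are assumptions for illustration, not the algorithm deployed in the study:

    import numpy as np

    class LinearThompsonBandit:
        # Bayesian linear model of reward per action (e.g., send / don't send).
        def __init__(self, n_actions, dim, noise=1.0):
            self.A = [np.eye(dim) for _ in range(n_actions)]    # precision
            self.b = [np.zeros(dim) for _ in range(n_actions)]  # X^T y
            self.noise = noise

        def act(self, x):
            # Sample weights from each action's posterior, then act greedily.
            scores = []
            for A, b in zip(self.A, self.b):
                cov = np.linalg.inv(A)
                w = np.random.multivariate_normal(cov @ b, self.noise * cov)
                scores.append(w @ x)
            return int(np.argmax(scores))

        def update(self, x, action, reward):
            self.A[action] += np.outer(x, x)
            self.b[action] += reward * x

    # One simulated user: context from sensors/self-report, treatment
    # decision, then a synthetic reward (e.g., a proxy for adherence).
    bandit = LinearThompsonBandit(n_actions=2, dim=3)
    for _ in range(100):
        x = np.random.randn(3)                       # stand-in sensor features
        a = bandit.act(x)
        r = float(a == (x[0] > 0)) + 0.1 * np.random.randn()
        bandit.update(x, a, r)

Assessing personalization then amounts to asking whether the policies such an algorithm learns differ across users by more than chance would predict, which is the question the talk takes up.
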
Fri 2:00 p.m. - 2:15 p.m.
Contributed Talk: MaxEnt RL and Robust Control (Talk)   
Benjamin Eysenbach, Sergey Levine
Fri 2:15 p.m. - 2:30 p.m.
Contributed Talk: Reset-Free Lifelong Learning with Skill-Space Planning (Talk)   
Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch
Fri 2:30 p.m. - 3:00 p.m.
Invited talk: Anusha Nagabandi (Talk)

Deep learning has shown promising results in robotics, but we are still far from having intelligent systems that can operate in the unstructured settings of the real world, where disturbances, variations, and unobserved factors lead to a dynamic environment. In this talk, we'll see that model-based deep RL can indeed allow for efficient skill acquisition, as well as the ability to repurpose models to solve a variety of tasks. We'll scale up these approaches to enable locomotion with a 6-DoF legged robot on varying terrains in the real world, as well as dexterous manipulation with a 24-DoF anthropomorphic hand in the real world. We then focus on the inevitable mismatch between an agent's training conditions and the test conditions in which it may actually be deployed, thus illuminating the need for adaptive systems. Inspired by the ability of humans and animals to adapt quickly in the face of unexpected changes, we present a meta-learning algorithm within this model-based RL framework to enable online adaptation of large, high-capacity models using only small amounts of data from the new task. These fast adaptation capabilities are seen in both simulation and the real world, with experiments such as a 6-legged robot adapting online to an unexpected payload or suddenly losing a leg. We will then further extend the capabilities of our robotic systems by enabling the agents to reason directly from raw image observations. Bridging the benefits of representation learning techniques with the adaptation capabilities of meta-RL, we'll present a unified framework for effective meta-RL from images. With robotic arms in the real world that learn peg insertion and Ethernet cable insertion to varying targets, we'll see the fast acquisition of new skills, directly from raw image observations in the real world. Finally, this talk will conclude that model-based deep RL provides a framework for making sense of the world, thus allowing for reasoning and adaptation capabilities that are necessary for successful operation in the dynamic settings of the real world.

Anusha Nagabandi
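
The model-based control loop the abstract builds on can be sketched simply: fit a dynamics model f(s, a) -> s' from experience, then plan at each step against a task reward. The random-shooting MPC planner below is a minimal sketch under assumed interfaces (the model f, the reward function, and all hyperparameters are illustrative), not the speaker's system:

    import numpy as np

    def plan(f, state, reward_fn, horizon=10, n_candidates=256, action_dim=2):
        # Random-shooting MPC: sample action sequences, roll the learned
        # model forward, and return the first action of the best sequence.
        seqs = np.random.uniform(-1, 1,
                                 size=(n_candidates, horizon, action_dim))
        returns = np.zeros(n_candidates)
        for i, seq in enumerate(seqs):
            s = state
            for a in seq:
                s = f(s, a)                  # learned dynamics model
                returns[i] += reward_fn(s, a)
        return seqs[np.argmax(returns)][0]

Because the model is task-agnostic, swapping reward_fn repurposes the same learned dynamics for new tasks, and the meta-learning extension mentioned in the talk adapts f online from small amounts of recent data.
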
Fri 3:00 p.m. - 3:30 p.m.
Break
Fri 3:30 p.m. - 4:00 p.m.
Invited talk: Ashley Edwards (Talk)

A common trope in sci-fi is to have a robot that can quickly solve some problem after watching a person, studying a video, or reading a book. While these settings are (currently) fictional, the benefits are real. Agents that can solve tasks by observing others have the potential to greatly reduce the burden of their human teachers, removing some of the need to hand-specify rewards or goals. In this talk, I consider the question of how an agent can not only learn by observing others, but also how it can learn quickly by training offline before taking any steps in the environment. First, I will describe an approach that trains a latent policy directly from state observations, which can then be quickly mapped to real actions in the agent’s environment. Then I will describe how we can train a novel value function, Q(s,s’), to learn off-policy from observations. Unlike previous imitation from observation approaches, this formulation goes beyond simply imitating and rather enables learning from potentially suboptimal observations.

Ashley Edwards
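
The Q(s, s') formulation mentioned above scores a transition between two states rather than a (state, action) pair, so it can be fit from observation-only trajectories. A minimal sketch, with an on-trajectory (SARSA-style) TD target and all architecture choices assumed for illustration rather than taken from the speaker's work:

    import torch
    import torch.nn as nn

    class QssCritic(nn.Module):
        # Q(s, s'): the value of moving from state s to state s'.
        def __init__(self, state_dim, hidden=256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(2 * state_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1),
            )

        def forward(self, s, s_next):
            return self.net(torch.cat([s, s_next], dim=-1)).squeeze(-1)

    def td_loss(critic, target_critic, batch, gamma=0.99):
        # One-step TD error on (s, s', r, s'') tuples taken from
        # observation-only data; no action labels are needed to fit the critic.
        s, s_next, r, s_next2 = batch
        with torch.no_grad():
            target = r + gamma * target_critic(s_next, s_next2)
        return nn.functional.mse_loss(critic(s, s_next), target)

Actions enter only at execution time: in the approach described, an inverse dynamics model maps the chosen high-value transition back to an executable action in the agent's own environment.
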
Fri 4:00 p.m. - 4:07 p.m.
NeurIPS RL Competitions: Flatland challenge (Talk)   
Sharada Mohanty
Fri 4:07 p.m. - 4:15 p.m.
NeurIPS RL Competitions: Learning to run a power network (Talk)   
Antoine Marot
Fri 4:15 p.m. - 4:22 p.m.
NeurIPS RL Competitions: Procgen challenge (Talk)
Sharada Mohanty
Fri 4:22 p.m. - 4:30 p.m.
NeurIPS RL Competitions: MineRL (Talk)   
William Guss, Stephanie Milani
Fri 4:30 p.m. - 5:00 p.m.
Invited talk: Karen Liu (Talk)

Creating realistic virtual humans has traditionally been considered a research problem in computer animation, primarily for entertainment applications. With recent breakthroughs in collaborative robots and deep reinforcement learning, accurately modeling human movements and behaviors has become a challenge also faced by researchers in robotics and artificial intelligence. For example, mobile robots and autonomous vehicles can benefit from training in environments populated with ambulating humans and learning to avoid colliding with them. Healthcare robots, on the other hand, need to embrace physical contact and learn to utilize it to enable humans' activities of daily living. An immediate concern in developing such an autonomous and powered robotic device is the safety of human users during the early development phase, when the control policies are still largely suboptimal. Learning from physically simulated humans and environments presents a promising alternative that enables robots to safely make and learn from mistakes without putting real people at risk. However, deploying such policies to interact with people in the real world adds additional complexity to the already challenging sim-to-real transfer problem. In this talk, I will present our current progress on solving the problem of sim-to-real transfer with humans in the environment, actively interacting with the robots through physical contact. We tackle the problem from two fronts: developing more relevant human models to facilitate robot learning, and developing human-aware robot perception and control policies. As an example that contextualizes our research effort, we develop a mobile manipulator to put clothes on people with physical impairments, enabling them to carry out day-to-day tasks and maintain independence.

Karen Liu
Fri 5:00 p.m. - 6:00 p.m.
Panel discussion
Pierre-Yves Oudeyer, Marc Bellemare, Peter Stone, Matt Botvinick, Susan Murphy, Anusha Nagabandi, Ashley Edwards, Karen Liu, Pieter Abbeel
Fri 6:00 p.m. - 7:00 p.m.
Poster session 2 (Poster session)
Poster: Planning from Pixels using Inverse Dynamics Models (Poster)
Poster: OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning (Poster)
Poster: Maximum Reward Formulation In Reinforcement Learning (Poster)
Poster: Reset-Free Lifelong Learning with Skill-Space Planning (Poster)
Poster: Mirror Descent Policy Optimization (Poster)
Poster: MaxEnt RL and Robust Control (Poster)
Poster: Learning Functionally Decomposed Hierarchies for Continuous Control Tasks with Path Planning (Poster)
Poster: Provably Efficient Policy Optimization via Thompson Sampling (Poster)
Poster: Weighted Bellman Backups for Improved Signal-to-Noise in Q-Updates (Poster)
Poster: Efficient Competitive Self-Play Policy Optimization (Poster)
Poster: Asymmetric self-play for automatic goal discovery in robotic manipulation (Poster)
Poster: Correcting Momentum in Temporal Difference Learning (Poster)
Poster: Decoupling Exploration and Exploitation in Meta-Reinforcement Learning without Sacrifices (Poster)
Poster: Diverse Exploration via InfoMax Options (Poster)
Poster: Model-Based Meta-Reinforcement Learning for Flight with Suspended Payloads (Poster)
Poster: Parrot: Data-driven Behavioral Priors for Reinforcement Learning (Poster)
Poster: C-Learning: Horizon-Aware Cumulative Accessibility Estimation (Poster)
Poster: Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning (Poster)
Poster: Data-Efficient Reinforcement Learning with Self-Predictive Representations (Poster)
Poster: Accelerating Reinforcement Learning with Learned Skill Priors (Poster)
Poster: C-Learning: Learning to Achieve Goals via Recursive Classification (Poster)
Poster: Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers (Poster)
Poster: Learning to Reach Goals via Iterated Supervised Learning (Poster)
Poster: Unified View of Inference-based Off-policy RL: Decoupling Algorithmic and Implemental Source of Performance Gaps (Poster)
Poster: Learning to Sample with Local and Global Contexts in Experience Replay Buffer (Poster)
Poster: Adversarial Environment Generation for Learning to Navigate the Web (Poster)
Poster: Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in First-person Simulated 3D Environments (Poster)
Poster: DisCo RL: Distribution-Conditioned Reinforcement Learning for General-Purpose Policies (Poster)
Poster: Discovery of Options via Meta-Gradients (Poster)
Poster: GRAC: Self-Guided and Self-Regularized Actor-Critic (Poster)
Poster: Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity (Poster)
Poster: Deep Bayesian Quadrature Policy Gradient (Poster)
Poster: PixL2R: Guiding Reinforcement Learning Using Natural Language by Mapping Pixels to Rewards (Poster)
Poster: A Policy Gradient Method for Task-Agnostic Exploration (Poster)
Poster: Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning (Poster)
Poster: Skill Transfer via Partially Amortized Hierarchical Planning (Poster)
Poster: On Effective Parallelization of Monte Carlo Tree Search (Poster)
Poster: Mastering Atari with Discrete World Models (Poster)
Poster: Average Reward Reinforcement Learning with Monotonic Policy Improvement (Poster)
Poster: Combating False Negatives in Adversarial Imitation Learning (Poster)
Poster: Evaluating Agents Without Rewards (Poster)
Poster: Learning Latent Landmarks for Generalizable Planning (Poster)
Poster: Conservative Safety Critics for Exploration (Poster)
Poster: Solving Compositional Reinforcement Learning Problems via Task Reduction (Poster)
Poster: Deep Q-Learning with Low Switching Cost (Poster)
Poster: Learning to Represent Action Values as a Hypergraph on the Action Vertices (Poster)
Poster: Addressing Distribution Shift in Online Reinforcement Learning with Offline Datasets (Poster)
Poster: TACTO: A Simulator for Learning Control from Touch Sensing (Poster)
Poster: Safe Reinforcement Learning with Natural Language Constraints (Poster)
Poster: Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks (Poster)
Poster: An Examination of Preference-based Reinforcement Learning for Treatment Recommendation (Poster)
Poster: Model-based Navigation in Environments with Novel Layouts Using Abstract $n$-D Maps (Poster)
Poster: Online Safety Assurance for Deep Reinforcement Learning (Poster)
Poster: Lyapunov Barrier Policy Optimization (Poster)
Poster: Evolving Reinforcement Learning Algorithms (Poster)
Poster: Chaining Behaviors from Data with Model-Free Reinforcement Learning (Poster)
Poster: Pairwise Weights for Temporal Credit Assignment (Poster)
Poster: Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning (Poster)
Poster: Understanding Learned Reward Functions (Poster)
Poster: Addressing reward bias in Adversarial Imitation Learning with neutral reward functions (Poster)
Poster: Reinforcement Learning with Bayesian Classifiers: Efficient Skill Learning from Outcome Examples (Poster)
Poster: Decoupling Representation Learning from Reinforcement Learning (Poster)
Poster: Model-Based Reinforcement Learning via Latent-Space Collocation (Poster)
Poster: A Variational Inference Perspective on Goal-Directed Behavior in Reinforcement Learning (Poster)
Poster: SCC: an efficient deep reinforcement learning agent mastering the game of StarCraft II (Poster)
Poster: Predictive PER: Balancing Priority and Diversity towards Stable Deep Reinforcement Learning (Poster)
Poster: Latent State Models for Meta-Reinforcement Learning from Images (Poster)
Poster: Dream and Search to Control: Latent Space Planning for Continuous Control (Poster)
Poster: Explanation Augmented Feedback in Human-in-the-Loop Reinforcement Learning (Poster)
Poster: Goal-Conditioned Reinforcement Learning in the Presence of an Adversary (Poster)
Poster: Regularized Inverse Reinforcement Learning (Poster)
Poster: Domain Adversarial Reinforcement Learning (Poster)
Poster: Safety Aware Reinforcement Learning (Poster)
Poster: Sample Efficient Training in Multi-Agent Adversarial Games with Limited Teammate Communication (Poster)
Poster: Amortized Variational Deep Q Network (Poster)
Poster: Disentangled Planning and Control in Vision Based Robotics via Reward Machines (Poster)
Poster: Maximum Mutation Reinforcement Learning for Scalable Control (Poster)
Poster: Unsupervised Task Clustering for Multi-Task Reinforcement Learning (Poster)
Poster: Learning Intrinsic Symbolic Rewards in Reinforcement Learning (Poster)
Poster: Preventing Value Function Collapse in Ensemble Q-Learning by Maximizing Representation Diversity (Poster)
Poster: Action and Perception as Divergence Minimization (Poster)
Poster: Randomized Ensembled Double Q-Learning: Learning Fast Without a Model (Poster)
Poster: D2RL: Deep Dense Architectures in Reinforcement Learning (Poster)
Poster: Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms (Poster)
Poster: Discovering Diverse Multi-Agent Strategic Behavior via Reward Randomization (Poster)
Poster: What Matters for On-Policy Deep Actor-Critic Methods? A Large-Scale Study (Poster)
Poster: Semantic State Representation for Reinforcement Learning (Poster)
Poster: Hyperparameter Auto-tuning in Self-Supervised Robotic Learning (Poster)
Poster: Targeted Query-based Action-Space Adversarial Policies on Deep Reinforcement Learning Agents (Poster)
Poster: Abstract Value Iteration for Hierarchical Deep Reinforcement Learning (Poster)
Poster: Compute- and Memory-Efficient Reinforcement Learning with Latent Experience Replay (Poster)
Poster: Emergent Road Rules In Multi-Agent Driving Environments (Poster)
Poster: An Algorithmic Causal Model of Credit Assignment in Reinforcement Learning (Poster)
Poster: Learning to Weight Imperfect Demonstrations (Poster)
Poster: Structure and randomness in planning and reinforcement learning (Poster)
Poster: Parameter-based Value Functions (Poster)
Poster: Influence-aware Memory for Deep Reinforcement Learning in POMDPs (Poster)
Poster: Modular Training, Integrated Planning Deep Reinforcement Learning for Mobile Robot Navigation (Poster)
Poster: How to make Deep RL work in Practice (Poster)
Poster: Super-Human Performance in Gran Turismo Sport Using Deep Reinforcement Learning (Poster)
Poster: Which Mutual-Information Representation Learning Objectives are Sufficient for Control? (Poster)
Poster: Curriculum Learning through Distilled Discriminators (Poster)
Poster: Self-Supervised Policy Adaptation during Deployment (Poster)
Poster: Trust, but verify: model-based exploration in sparse reward environments (Poster)
Poster: Optimizing Traffic Bottleneck Throughput using Cooperative, Decentralized Autonomous Vehicles (Poster)
Poster: Tonic: A Deep Reinforcement Learning Library for Fast Prototyping and Benchmarking (Poster)
Poster: Revisiting Rainbow: Promoting more insightful and inclusive deep reinforcement learning research (Poster)
Poster: Reinforcement Learning with Latent Flow (Poster)
Poster: Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization (Poster)
Poster: AWAC: Accelerating Online Reinforcement Learning With Offline Datasets (Poster)
Poster: Inter-Level Cooperation in Hierarchical Reinforcement Learning (Poster)
Poster: Towards Effective Context for Meta-Reinforcement Learning: an Approach based on Contrastive Learning (Poster)
Poster: Multi-Agent Option Critic Architecture (Poster)
Poster: Measuring Visual Generalization in Continuous Control from Pixels (Poster)
Poster: Policy Learning Using Weak Supervision (Poster)
Poster: Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments (Poster)
Poster: Unsupervised Domain Adaptation for Visual Navigation (Poster)
Poster: Learning Markov State Abstractions for Deep Reinforcement Learning (Poster)
Poster: Value Generalization among Policies: Improving Value Function with Policy Representation (Poster)
Poster: Energy-based Surprise Minimization for Multi-Agent Value Factorization (Poster)
Poster: Backtesting Optimal Trade Execution Policies in Agent-Based Market Simulator (Poster)
Poster: Successor Landmarks for Efficient Exploration and Long-Horizon Navigation (Poster)
Poster: Multi-task Reinforcement Learning with a Planning Quasi-Metric (Poster)
Poster: R-LAtte: Visual Control via Deep Reinforcement Learning with Attention Network (Poster)
Poster: Quantifying Differences in Reward Functions (Poster)
Poster: DERAIL: Diagnostic Environments for Reward And Imitation Learning (Poster)
Poster: Exploring Zero-Shot Emergent Communication in Embodied Multi-Agent Populations (Poster)
Poster: Unlocking the Potential of Deep Counterfactual Value Networks (Poster)
Poster: FactoredRL: Leveraging Factored Graphs for Deep Reinforcement Learning (Poster)
Poster: Reusability and Transferability of Macro Actions for Reinforcement Learning (Poster)
Poster: Interactive Visualization for Debugging RL (Poster)
Poster: A Deep Value-based Policy Search Approach for Real-world Vehicle Repositioning on Mobility-on-Demand Platforms (Poster)
Poster: FinRL: A Deep Reinforcement Learning Library for Automated Stock Trading in Quantitative Finance (Poster)
Poster: Visual Imitation with Reinforcement Learning using Recurrent Siamese Networks (Poster)
Poster: Learning Accurate Long-term Dynamics for Model-based Reinforcement Learning (Poster)
Poster: XLVIN: eXecuted Latent Value Iteration Nets (Poster)
Poster: Beyond Exponentially Discounted Sum: Automatic Learning of Return Function (Poster)
Poster: XT2: Training an X-to-Text Typing Interface with Online Learning from Implicit Feedback (Poster)
Poster: Greedy Multi-Step Off-Policy Reinforcement Learning (Poster)
Poster: Variational Empowerment as Representation Learning for Goal-Based Reinforcement Learning (Poster)
Poster: Robust Domain Randomised Reinforcement Learning through Peer-to-Peer Distillation (Poster)
Poster: ReaPER: Improving Sample Efficiency in Model-Based Latent Imagination (Poster)
Poster: Model-Based Reinforcement Learning: A Compressed Survey (Poster)
Poster: BeBold: Exploration Beyond the Boundary of Explored Regions (Poster)
Poster: Model-Based Visual Planning with Self-Supervised Functional Distances (Poster)
Poster: Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning (Poster)
Poster: Utilizing Skipped Frames in Action Repeats via Pseudo-Actions (Poster)
Poster: Bringing order into Actor-Critic Algorithms using Stackelberg Games (Poster)
Poster: Continual Model-Based Reinforcement Learning with Hypernetworks (Poster)
Poster: Online Hyper-parameter Tuning in Off-policy Learning via Evolutionary Strategies (Poster)
Poster: Policy Guided Planning in Learned Latent Space (Poster)
Poster: PettingZoo: Gym for Multi-Agent Reinforcement Learning (Poster)
Poster: DREAM: Deep Regret minimization with Advantage baselines and Model-free learning (Poster)

Author Information

Pieter Abbeel (UC Berkeley & Covariant)

Pieter Abbeel is Professor and Director of the Robot Learning Lab at UC Berkeley [2008- ], Co-Director of the Berkeley AI Research (BAIR) Lab, Co-Founder of covariant.ai [2017- ], Co-Founder of Gradescope [2014- ], Advisor to OpenAI, Founding Faculty Partner of the AI@TheHouse venture fund, and Advisor to many AI/robotics start-ups. He works in machine learning and robotics. In particular, his research focuses on making robots learn from people (apprenticeship learning), making robots learn through their own trial and error (reinforcement learning), and speeding up skill acquisition through learning-to-learn (meta-learning). His robots have learned advanced helicopter aerobatics, knot-tying, basic assembly, organizing laundry, locomotion, and vision-based robotic manipulation. He has won numerous awards, including best paper awards at ICML, NIPS, and ICRA, early career awards from NSF, DARPA, ONR, AFOSR, Sloan, TR35, and IEEE, and the Presidential Early Career Award for Scientists and Engineers (PECASE). Pieter's work is frequently featured in the popular press, including the New York Times, BBC, Bloomberg, the Wall Street Journal, Wired, Forbes, Tech Review, and NPR.

Chelsea Finn (Stanford)
Joelle Pineau (McGill University)

Joelle Pineau is an Associate Professor and William Dawson Scholar at McGill University where she co-directs the Reasoning and Learning Lab. She also leads the Facebook AI Research lab in Montreal, Canada. She holds a BASc in Engineering from the University of Waterloo, and an MSc and PhD in Robotics from Carnegie Mellon University. Dr. Pineau's research focuses on developing new models and algorithms for planning and learning in complex partially-observable domains. She also works on applying these algorithms to complex problems in robotics, health care, games and conversational agents. She serves on the editorial board of the Journal of Artificial Intelligence Research and the Journal of Machine Learning Research and is currently President of the International Machine Learning Society. She is a recipient of NSERC's E.W.R. Steacie Memorial Fellowship (2018), a Fellow of the Association for the Advancement of Artificial Intelligence (AAAI), a Senior Fellow of the Canadian Institute for Advanced Research (CIFAR) and in 2016 was named a member of the College of New Scholars, Artists and Scientists by the Royal Society of Canada.

David Silver (DeepMind)
Satinder Singh (University of Michigan)
Coline Devin (DeepMind)
Misha Laskin (UC Berkeley)
Kimin Lee (UC Berkeley)
Janarthanan Rajendran (University of Michigan)
Vivek Veeriah (University of Michigan)
