Workshop
Mon Dec 13 08:55 AM -- 06:00 PM (PST)
Deep Reinforcement Learning
In recent years, the use of deep neural networks as function approximators has enabled researchers to extend reinforcement learning techniques to solve increasingly complex control tasks. The emerging field of deep reinforcement learning has led to remarkable empirical results in rich and varied domains like robotics, strategy games, and multiagent interactions. This workshop will bring together researchers working at the intersection of deep learning and reinforcement learning, and it will help interested researchers outside of the field gain perspective about the current state of the art and potential directions for future contributions.
Welcome and Introduction (Welcoming Notes)
Implicit Behavioral Cloning (Oral)
Implicit Behavioral Cloning Q&A (Q&A)
DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization (Oral)
DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization Q&A (Q&A)
HyAR: Addressing Discrete-Continuous Action Reinforcement Learning via Hybrid Action Representation (Oral)
HyAR: Addressing Discrete-Continuous Action Reinforcement Learning via Hybrid Action Representation Q&A (Q&A)
Benchmarking the Spectrum of Agent Capabilities (Oral)
Benchmarking the Spectrum of Agent Capabilities Q&A (Q&A)
Invited Talk: Laura Schulz - In praise of folly: Goals, play, and human cognition (Talk)
Laura Schulz Talk Q&A (Q&A)
Break
Opinion Contributed Talk: Wilka Carvalho (Talk)
Wilka Carvalho Talk Q&A (Q&A)
Adaptive Scheduling of Data Augmentation for Deep Reinforcement Learning (Oral)
Adaptive Scheduling of Data Augmentation for Deep Reinforcement Learning Q&A (Q&A)
Offline Meta-Reinforcement Learning with Online Self-Supervision (Oral)
Offline Meta-Reinforcement Learning with Online Self-Supervision Q&A (Q&A)
Invited Talk: George Konidaris - Signal to Symbol (via Skills) (Talk)
George Konidaris Talk Q&A (Q&A)
Poster Session (in Gather Town) (Poster Session)
Opinion Contributed Talk: Sergey Levine (Talk)
Sergey Levine Talk Q&A (Q&A)
Panel Discussion 1 (Panel Discussion)
Invited Talk: Dale Schuurmans - Understanding Deep Value Estimation (Talk)
Dale Schuurmans Talk Q&A (Q&A)
Break
Invited Talk: Karol Hausman - Reinforcement Learning as a Data Sponge (Talk)
Karol Hausman Talk Q&A (Q&A)
NeurIPS RL Competitions Results Presentations (Presentations)
Invited Talk: Kenji Doya - Natural and Artificial Reinforcement Learning (Talk)
Kenji Doya Talk Q&A (Q&A)
Panel Discussion 2 (Panel Discussion)
Task-Induced Representation Learning (Poster)
No DICE: An Investigation of the Bias-Variance Tradeoff in Meta-Gradients (Poster)
Block Contextual MDPs for Continual Learning (Poster)
PFPN: Continuous Control of Physically Simulated Characters using Particle Filtering Policy Network (Poster)
URLB: Unsupervised Reinforcement Learning Benchmark (Poster)
Augmenting Reinforcement Learning with Behavior Primitives for Diverse Manipulation Tasks (Poster)
Strength Through Diversity: Robust Behavior Learning via Mixture Policies (Poster)
Long-Term Credit Assignment via Model-based Temporal Shortcuts (Poster)
C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks (Poster)
Fast and Data-Efficient Training of Rainbow: an Experimental Study on Atari (Poster)
Implicit Behavioral Cloning (Poster)
Policy Gradients Incorporating the Future (Poster)
TempoRL: Temporal Priors for Exploration in Off-Policy Reinforcement Learning (Poster)
Dynamic Mirror Descent based Model Predictive Control for Accelerating Robot Learning (Poster)
Exploring through Random Curiosity with General Value Functions (Poster)
Maximum Entropy Model-based Reinforcement Learning (Poster)
Exponential Family Model-Based Reinforcement Learning via Score Matching (Poster)
Imitation Learning from Pixel Observations for Continuous Control (Poster)
Covariate Shift of Latent Confounders in Imitation and Reinforcement Learning (Poster)
Latent Geodesics of Model Dynamics for Offline Reinforcement Learning (Poster)
On Using Hamiltonian Monte Carlo Sampling for Reinforcement Learning Problems in High-dimension (Poster)
Skill Preferences: Learning to Extract and Execute Robotic Skills from Human Feedback (Poster)
The Information Geometry of Unsupervised Reinforcement Learning (Poster)
Mismatched No More: Joint Model-Policy Optimization for Model-Based RL (Poster)
Graph Backup: Data Efficient Backup Exploiting Markovian Data (Poster)
Modern Hopfield Networks for Return Decomposition for Delayed Rewards (Poster)
Learning Efficient Multi-Agent Cooperative Visual Exploration (Poster)
Learning compositional tasks from language instructions (Poster)
Deep Reinforcement Learning Explanation via Model Transforms (Poster)
A Family of Cognitively Realistic Parsing Environments for Deep Reinforcement Learning (Poster)
OstrichRL: A Musculoskeletal Ostrich Simulation to Study Bio-mechanical Locomotion (Poster)
CoMPS: Continual Meta Policy Search (Poster)
Expert Human-Level Driving in Gran Turismo Sport Using Deep Reinforcement Learning with Image-based Representation (Poster)
MHER: Model-based Hindsight Experience Replay (Poster)
On the Transferability of Deep-Q Networks (Poster)
A Graph Policy Network Approach for Volt-Var Control in Power Distribution Systems (Poster)
Math Programming based Reinforcement Learning for Multi-Echelon Inventory Management (Poster)
Transferring Dexterous Manipulation from GPU Simulation to a Remote Real-World Trifinger (Poster)
Who Is the Strongest Enemy? Towards Optimal and Efficient Evasion Attacks in Deep RL (Poster)
Automatic Curricula via Expert Demonstrations (Poster)
A Closer Look at Gradient Estimators with Reinforcement Learning as Inference (Poster)
From One Hand to Multiple Hands: Imitation Learning for Dexterous Manipulation from Single-Camera Teleoperation (Poster)
Plan Better Amid Conservatism: Offline Multi-Agent Reinforcement Learning with Actor Rectification (Poster)
Generalisation in Lifelong Reinforcement Learning through Logical Composition (Poster)
DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations (Poster)
Temporal-Difference Value Estimation via Uncertainty-Guided Soft Updates (Poster)
Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers (Poster)
Target Entropy Annealing for Discrete Soft Actor-Critic (Poster)
Count-Based Temperature Scheduling for Maximum Entropy Reinforcement Learning (Poster)
StarCraft II Unplugged: Large Scale Offline Reinforcement Learning (Poster)
Status-quo policy gradient in Multi-Agent Reinforcement Learning (Poster)
Deep RePReL--Combining Planning and Deep RL for acting in relational domains (Poster)
Learning from demonstrations with SACR2: Soft Actor-Critic with Reward Relabeling (Poster)
Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives (Poster)
Beyond Target Networks: Improving Deep $Q$-learning with Functional Regularization (Poster)
Distributional Decision Transformer for Offline Hindsight Information Matching (Poster)
Transfer RL across Observation Feature Spaces via Model-Based Regularization (Poster)
Improving Actor-Critic Reinforcement Learning via Hamiltonian Monte Carlo Method (Poster)
General Characterization of Agents by States they Visit (Poster)
A Modern Self-Referential Weight Matrix That Learns to Modify Itself (Poster)
Understanding the Effects of Dataset Composition on Offline Reinforcement Learning (Poster)
Benchmarking the Spectrum of Agent Capabilities (Poster)
Interactive Robust Policy Optimization for Multi-Agent Reinforcement Learning (Poster)
Variance-Seeking Meta-Exploration to Handle Out-of-Distribution Tasks (Poster)
A Framework for Efficient Robotic Manipulation (Poster)
Imitation Learning from Observations under Transition Model Disparity (Poster)
Vision-Guided Quadrupedal Locomotion in the Wild with Multi-Modal Delay Randomization (Poster)
Look Closer: Bridging Egocentric and Third-Person Views with Transformers for Robotic Manipulation (Poster)
Follow the Object: Curriculum Learning for Manipulation Tasks with Imagined Goals (Poster)
Learning Action Translator for Meta Reinforcement Learning on Sparse-Reward Tasks (Poster)
Data Sharing without Rewards in Multi-Task Offline Reinforcement Learning (Poster)
Should I Run Offline Reinforcement Learning or Behavioral Cloning? (Poster)
DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization (Poster)
Task-driven Discovery of Perceptual Schemas for Generalization in Reinforcement Learning (Poster)
Offline Reinforcement Learning with In-sample Q-Learning (Poster)
Wasserstein Distance Maximizing Intrinsic Control (Poster)
Targeted Environment Design from Offline Data (Poster)
GPU-Podracer: Scalable and Elastic Library for Cloud-Native Deep Reinforcement Learning (Poster)
Behavior Predictive Representations for Generalization in Reinforcement Learning (Poster)
An Empirical Study of Non-Uniform Sampling in Off-Policy Reinforcement Learning for Continuous Control (Poster)
Offline Meta-Reinforcement Learning with Online Self-Supervision (Poster)
Unsupervised Learning of Temporal Abstractions using Slot-based Transformers (Poster)
Learning Two-Player Mixture Markov Games: Kernel Function Approximation and Correlated Equilibrium (Poster)
Stability Analysis in Mixed-Autonomous Traffic with Deep Reinforcement Learning (Poster)
Accelerated Deep Reinforcement Learning of Terrain-Adaptive Locomotion Skills (Poster)
Mean-Variance Efficient Reinforcement Learning by Expected Quadratic Utility Maximization (Poster)
Return Dispersion as an Estimator of Learning Potential for Prioritized Level Replay (Poster)
A Meta-Gradient Approach to Learning Cooperative Multi-Agent Communication Topology (Poster)
Continuous Control with Action Quantization from Demonstrations (Poster)
HyAR: Addressing Discrete-Continuous Action Reinforcement Learning via Hybrid Action Representation (Poster)
Behavioral Priors and Dynamics Models: Improving Performance and Domain Transfer in Offline RL (Poster)
Self-Imitation Learning from Demonstrations (Poster)
Understanding and Preventing Capacity Loss in Reinforcement Learning (Poster)
BLAST: Latent Dynamics Models from Bootstrapping (Poster)
Learning Robust Dynamics through Variational Sparse Gating (Poster)
Fast Inference and Transfer of Compositional Task for Few-shot Task Generalization (Poster)
Bayesian Exploration for Lifelong Reinforcement Learning (Poster)
Offline Policy Selection under Uncertainty (Poster)
Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies (Poster)
Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization (Poster)
Reward Uncertainty for Exploration in Preference-based Reinforcement Learning (Poster)
Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations (Poster)
CIC: Contrastive Intrinsic Control for Unsupervised Skill Discovery (Poster)
SURF: Semi-supervised Reward Learning with Data Augmentation for Feedback-efficient Preference-based Reinforcement Learning (Poster)
OVD-Explorer: A General Information-theoretic Exploration Approach for Reinforcement Learning (Poster)
GrASP: Gradient-Based Affordance Selection for Planning (Poster)
Recurrent Off-policy Baselines for Memory-based Continuous Control (Poster)
Embodiment perspective of reward definition for behavioural homeostasis (Poster)
Communication-Efficient Actor-Critic Methods for Homogeneous Markov Games (Poster)
That Escalated Quickly: Compounding Complexity by Editing Levels at the Frontier of Agent Capabilities (Poster)
Large Scale Coordination Transfer for Cooperative Multi-Agent Reinforcement Learning (Poster)
Investigation of Independent Reinforcement Learning Algorithms in Multi-Agent Environments (Poster)
Skill-based Meta-Reinforcement Learning (Poster)
Robust Robotic Control from Pixels using Contrastive Recurrent State-Space Models (Poster)
Hybrid Imitative Planning with Geometric and Predictive Costs in Offroad Environments (Poster)
Discriminator Augmented Model-Based Reinforcement Learning (Poster)
Adaptive Scheduling of Data Augmentation for Deep Reinforcement Learning (Poster)
Introducing Symmetries to Black Box Meta Reinforcement Learning (Poster)
Component Transfer Learning for Deep RL Based on Abstract Representations (Poster)
ShinRL: A Library for Evaluating RL Algorithms from Theoretical and Practical Perspectives (Poster)
Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning (Poster)
Implicitly Regularized RL with Implicit Q-values (Poster)
Towards Automatic Actor-Critic Solutions to Continuous Control (Poster)
Hierarchical Few-Shot Imitation with Skill Transition Models (Poster)
Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning (Poster)
Policy Optimization via Optimal Policy Evaluation (Poster)
A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning (Poster)
Attention-based Partial Decoupling of Policy and Value for Generalization in Reinforcement Learning (Poster)
Learning Value Functions from Undirected State-only Experience (Poster)
The Reflective Explorer: Online Meta-Exploration from Offline Data in Realistic Robotic Tasks (Poster)
Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning (Poster)
Benchmark for Out-of-Distribution Detection in Deep Reinforcement Learning (Poster)
Learning from Guided Play: A Scheduled Hierarchical Approach for Improving Exploration in Adversarial Imitation Learning (Poster)
Off-Policy Correction For Multi-Agent Reinforcement Learning (Poster)
Wish you were here: Hindsight Goal Selection for long-horizon dexterous manipulation (Poster)
Neighborhood Mixup Experience Replay: Local Convex Interpolation for Improved Sample Efficiency in Continuous Control Tasks (Poster)
Cross-Domain Imitation Learning via Optimal Transport (Poster)
Lifting the veil on hyper-parameters for value-based deep reinforcement learning (Poster)
TransDreamer: Reinforcement Learning with Transformer World Models (Poster)
Learning Parameterized Task Structure for Generalization to Unseen Entities (Poster)
The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models (Poster)
Learning a Subspace of Policies for Online Adaptation in Reinforcement Learning (Poster)
Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning (Poster)
Meta Arcade: A Configurable Environment Suite for Deep Reinforcement Learning and Meta-Learning (Poster)
Hindsight Foresight Relabeling for Meta-Reinforcement Learning (Poster)
Continuous Control With Ensemble Deep Deterministic Policy Gradients (Poster)
What Would the Expert $do(\cdot)$?: Causal Imitation Learning (Poster)
Grounding Aleatoric Uncertainty in Unsupervised Environment Design (Poster)