Workshop: Deep Reinforcement Learning
Mon Dec 13, 08:55 AM -- 06:00 PM (PST)
In recent years, the use of deep neural networks as function approximators has enabled researchers to extend reinforcement learning techniques to solve increasingly complex control tasks. The emerging field of deep reinforcement learning has led to remarkable empirical results in rich and varied domains like robotics, strategy games, and multiagent interactions. This workshop will bring together researchers working at the intersection of deep learning and reinforcement learning, and it will help interested researchers outside of the field gain perspective about the current state of the art and potential directions for future contributions.
Welcome and Introduction (Welcoming Notes)
Implicit Behavioral Cloning (Oral)
Implicit Behavioral Cloning Q&A (Q&A)
DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization (Oral)
DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization Q&A (Q&A)
HyAR: Addressing Discrete-Continuous Action Reinforcement Learning via Hybrid Action Representation (Oral)
HyAR: Addressing Discrete-Continuous Action Reinforcement Learning via Hybrid Action Representation Q&A (Q&A)
Benchmarking the Spectrum of Agent Capabilities (Oral)
Benchmarking the Spectrum of Agent Capabilities Q&A (Q&A)
Invited Talk: Laura Schulz - In praise of folly: Goals, play, and human cognition (Talk)
Laura Schulz Talk Q&A (Q&A)
Break
Opinion Contributed Talk: Wilka Carvalho (Talk)
Wilka Carvalho Talk Q&A (Q&A)
Adaptive Scheduling of Data Augmentation for Deep Reinforcement Learning (Oral)
Adaptive Scheduling of Data Augmentation for Deep Reinforcement Learning Q&A (Q&A)
Offline Meta-Reinforcement Learning with Online Self-Supervision (Oral)
Offline Meta-Reinforcement Learning with Online Self-Supervision Q&A (Q&A)
Invited Talk: George Konidaris - Signal to Symbol (via Skills) (Talk)
George Konidaris Talk Q&A (Q&A)
Poster Session (in Gather Town) (Poster Session)
Opinion Contributed Talk: Sergey Levine (Talk)
Sergey Levine Talk Q&A (Q&A)
Panel Discussion 1 (Panel Discussion)
Invited Talk: Dale Schuurmans - Understanding Deep Value Estimation (Talk)
Dale Schuurmans Talk Q&A (Q&A)
Break
Invited Talk: Karol Hausman - Reinforcement Learning as a Data Sponge (Talk)
Karol Hausman Talk Q&A (Q&A)
NeurIPS RL Competitions Results Presentations (Presentations)
Invited Talk: Kenji Doya - Natural and Artificial Reinforcement Learning (Talk)
Kenji Doya Talk Q&A (Q&A)
Panel Discussion 2 (Panel Discussion)
ShinRL: A Library for Evaluating RL Algorithms from Theoretical and Practical Perspectives (Poster)
Towards Automatic Actor-Critic Solutions to Continuous Control (Poster)
Learning Robust Dynamics through Variational Sparse Gating (Poster)
Augmenting Reinforcement Learning with Behavior Primitives for Diverse Manipulation Tasks (Poster)
Vision-Guided Quadrupedal Locomotion in the Wild with Multi-Modal Delay Randomization (Poster)
Task-Induced Representation Learning (Poster)
Stability Analysis in Mixed-Autonomous Traffic with Deep Reinforcement Learning (Poster)
Exponential Family Model-Based Reinforcement Learning via Score Matching (Poster)
OVD-Explorer: A General Information-theoretic Exploration Approach for Reinforcement Learning (Poster)
Behavioral Priors and Dynamics Models: Improving Performance and Domain Transfer in Offline RL (Poster)
DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations (Poster)
Count-Based Temperature Scheduling for Maximum Entropy Reinforcement Learning (Poster)
Lifting the veil on hyper-parameters for value-based deep reinforcement learning (Poster)
Strength Through Diversity: Robust Behavior Learning via Mixture Policies (Poster)
Dynamic Mirror Descent based Model Predictive Control for Accelerating Robot Learning (Poster)
Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives (Poster)
Behavior Predictive Representations for Generalization in Reinforcement Learning (Poster)
A Family of Cognitively Realistic Parsing Environments for Deep Reinforcement Learning (Poster)
Who Is the Strongest Enemy? Towards Optimal and Efficient Evasion Attacks in Deep RL (Poster)
Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies (Poster)
Plan Better Amid Conservatism: Offline Multi-Agent Reinforcement Learning with Actor Rectification (Poster)
Transferring Dexterous Manipulation from GPU Simulation to a Remote Real-World Trifinger (Poster)
OstrichRL: A Musculoskeletal Ostrich Simulation to Study Bio-mechanical Locomotion (Poster)
Component Transfer Learning for Deep RL Based on Abstract Representations (Poster)
A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning (Poster)
Neighborhood Mixup Experience Replay: Local Convex Interpolation for Improved Sample Efficiency in Continuous Control Tasks (Poster)
Exploring through Random Curiosity with General Value Functions (Poster)
Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization (Poster)
Offline Meta-Reinforcement Learning with Online Self-Supervision (Poster)
PFPN: Continuous Control of Physically Simulated Characters using Particle Filtering Policy Network (Poster)
DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization (Poster)
A Graph Policy Network Approach for Volt-Var Control in Power Distribution Systems (Poster)
SURF: Semi-supervised Reward Learning with Data Augmentation for Feedback-efficient Preference-based Reinforcement Learning (Poster)
Learning from demonstrations with SACR2: Soft Actor-Critic with Reward Relabeling (Poster)
Math Programming based Reinforcement Learning for Multi-Echelon Inventory Management (Poster)
Attention-based Partial Decoupling of Policy and Value for Generalization in Reinforcement Learning (Poster)
Understanding and Preventing Capacity Loss in Reinforcement Learning (Poster)
Learning compositional tasks from language instructions (Poster)
Imitation Learning from Pixel Observations for Continuous Control (Poster)
Learning a Subspace of Policies for Online Adaptation in Reinforcement Learning (Poster)
Mismatched No More: Joint Model-Policy Optimization for Model-Based RL (Poster)
Deep Reinforcement Learning Explanation via Model Transforms (Poster)
Policy Optimization via Optimal Policy Evaluation (Poster)
Investigation of Independent Reinforcement Learning Algorithms in Multi-Agent Environments (Poster)
The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models (Poster)
Learning Efficient Multi-Agent Cooperative Visual Exploration (Poster)
GPU-Podracer: Scalable and Elastic Library for Cloud-Native Deep Reinforcement Learning (Poster)
General Characterization of Agents by States they Visit (Poster)
Deep RePReL--Combining Planning and Deep RL for acting in relational domains (Poster)
StarCraft II Unplugged: Large Scale Offline Reinforcement Learning (Poster)
BLAST: Latent Dynamics Models from Bootstrapping (Poster)
From One Hand to Multiple Hands: Imitation Learning for Dexterous Manipulation from Single-Camera Teleoperation (Poster)
Embodiment perspective of reward definition for behavioural homeostasis (Poster)
Status-quo policy gradient in Multi-Agent Reinforcement Learning (Poster)
A Modern Self-Referential Weight Matrix That Learns to Modify Itself (Poster)
Understanding the Effects of Dataset Composition on Offline Reinforcement Learning (Poster)
Large Scale Coordination Transfer for Cooperative Multi-Agent Reinforcement Learning (Poster)
CoMPS: Continual Meta Policy Search (Poster)
Maximum Entropy Model-based Reinforcement Learning (Poster)
TempoRL: Temporal Priors for Exploration in Off-Policy Reinforcement Learning (Poster)
Offline Policy Selection under Uncertainty (Poster)
Recurrent Off-policy Baselines for Memory-based Continuous Control (Poster)
Bayesian Exploration for Lifelong Reinforcement Learning (Poster)
Learning Action Translator for Meta Reinforcement Learning on Sparse-Reward Tasks (Poster)
Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning (Poster)
Automatic Curricula via Expert Demonstrations (Poster)
Accelerated Deep Reinforcement Learning of Terrain-Adaptive Locomotion Skills (Poster)
Mean-Variance Efficient Reinforcement Learning by Expected Quadratic Utility Maximization (Poster)
Improving Actor-Critic Reinforcement Learning via Hamiltonian Monte Carlo Method (Poster)
Implicitly Regularized RL with Implicit Q-values (Poster)
Learning from Guided Play: A Scheduled Hierarchical Approach for Improving Exploration in Adversarial Imitation Learning (Poster)
Beyond Target Networks: Improving Deep $Q$-learning with Functional Regularization (Poster)
Hindsight Foresight Relabeling for Meta-Reinforcement Learning (Poster)
No DICE: An Investigation of the Bias-Variance Tradeoff in Meta-Gradients (Poster)
MHER: Model-based Hindsight Experience Replay (Poster)
Imitation Learning from Observations under Transition Model Disparity (Poster)
HyAR: Addressing Discrete-Continuous Action Reinforcement Learning via Hybrid Action Representation (Poster)
Look Closer: Bridging Egocentric and Third-Person Views with Transformers for Robotic Manipulation (Poster)
Data Sharing without Rewards in Multi-Task Offline Reinforcement Learning (Poster)
Communication-Efficient Actor-Critic Methods for Homogeneous Markov Games (Poster)
Implicit Behavioral Cloning (Poster)
Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers (Poster)
That Escalated Quickly: Compounding Complexity by Editing Levels at the Frontier of Agent Capabilities (Poster)
Self-Imitation Learning from Demonstrations (Poster)
A Closer Look at Gradient Estimators with Reinforcement Learning as Inference (Poster)
Fast and Data-Efficient Training of Rainbow: an Experimental Study on Atari (Poster)
Grounding Aleatoric Uncertainty in Unsupervised Environment Design (Poster)
Should I Run Offline Reinforcement Learning or Behavioral Cloning? (Poster)
Wish you were here: Hindsight Goal Selection for long-horizon dexterous manipulation (Poster)
Modern Hopfield Networks for Return Decomposition for Delayed Rewards (Poster)
Graph Backup: Data Efficient Backup Exploiting Markovian Data (Poster)
URLB: Unsupervised Reinforcement Learning Benchmark (Poster)
A Framework for Efficient Robotic Manipulation (Poster)
Benchmarking the Spectrum of Agent Capabilities (Poster)
Robust Robotic Control from Pixels using Contrastive Recurrent State-Space Models (Poster)
GrASP: Gradient-Based Affordance Selection for Planning (Poster)
Introducing Symmetries to Black Box Meta Reinforcement Learning (Poster)
Follow the Object: Curriculum Learning for Manipulation Tasks with Imagined Goals (Poster)
What Would the Expert $do(\cdot)$?: Causal Imitation Learning (Poster)
Policy Gradients Incorporating the Future (Poster)
Skill Preferences: Learning to Extract and Execute Robotic Skills from Human Feedback (Poster)
Cross-Domain Imitation Learning via Optimal Transport (Poster)
Learning Value Functions from Undirected State-only Experience (Poster)
Interactive Robust Policy Optimization for Multi-Agent Reinforcement Learning (Poster)
Targeted Environment Design from Offline Data (Poster)
The Information Geometry of Unsupervised Reinforcement Learning (Poster)
Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning (Poster)
Target Entropy Annealing for Discrete Soft Actor-Critic (Poster)
Temporal-Difference Value Estimation via Uncertainty-Guided Soft Updates (Poster)
Reward Uncertainty for Exploration in Preference-based Reinforcement Learning (Poster)
Hierarchical Few-Shot Imitation with Skill Transition Models (Poster)
Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations (Poster)
On the Transferability of Deep-Q Networks (Poster)
Distributional Decision Transformer for Offline Hindsight Information Matching (Poster)
Adaptive Scheduling of Data Augmentation for Deep Reinforcement Learning (Poster)
CIC: Contrastive Intrinsic Control for Unsupervised Skill Discovery (Poster)
Offline Reinforcement Learning with In-sample Q-Learning (Poster)
Off-Policy Correction For Multi-Agent Reinforcement Learning (Poster)
Long-Term Credit Assignment via Model-based Temporal Shortcuts (Poster)
Learning Two-Player Mixture Markov Games: Kernel Function Approximation and Correlated Equilibrium (Poster)
Hybrid Imitative Planning with Geometric and Predictive Costs in Offroad Environments (Poster)
Skill-based Meta-Reinforcement Learning (Poster)
The Reflective Explorer: Online Meta-Exploration from Offline Data in Realistic Robotic Tasks (Poster)
Wasserstein Distance Maximizing Intrinsic Control (Poster)
A Meta-Gradient Approach to Learning Cooperative Multi-Agent Communication Topology (Poster)
Continuous Control With Ensemble Deep Deterministic Policy Gradients (Poster)
Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning (Poster)
Discriminator Augmented Model-Based Reinforcement Learning (Poster)
Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning (Poster)
Task-driven Discovery of Perceptual Schemas for Generalization in Reinforcement Learning (Poster)
Learning Parameterized Task Structure for Generalization to Unseen Entities (Poster)
Variance-Seeking Meta-Exploration to Handle Out-of-Distribution Tasks (Poster)
An Empirical Study of Non-Uniform Sampling in Off-Policy Reinforcement Learning for Continuous Control (Poster)
Fast Inference and Transfer of Compositional Task for Few-shot Task Generalization (Poster)
Continuous Control with Action Quantization from Demonstrations (Poster)
TransDreamer: Reinforcement Learning with Transformer World Models (Poster)
Benchmark for Out-of-Distribution Detection in Deep Reinforcement Learning (Poster)
Transfer RL across Observation Feature Spaces via Model-Based Regularization (Poster)
Block Contextual MDPs for Continual Learning (Poster)
Unsupervised Learning of Temporal Abstractions using Slot-based Transformers (Poster)
Return Dispersion as an Estimator of Learning Potential for Prioritized Level Replay (Poster)
On Using Hamiltonian Monte Carlo Sampling for Reinforcement Learning Problems in High-dimension (Poster)
Meta Arcade: A Configurable Environment Suite for Deep Reinforcement Learning and Meta-Learning (Poster)
C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks (Poster)
Generalisation in Lifelong Reinforcement Learning through Logical Composition (Poster)
Latent Geodesics of Model Dynamics for Offline Reinforcement Learning (Poster)
Expert Human-Level Driving in Gran Turismo Sport Using Deep Reinforcement Learning with Image-based Representation (Poster)
Covariate Shift of Latent Confounders in Imitation and Reinforcement Learning (Poster)