12 Results
Type | Time | Title | Authors
Poster | Wed 9:00 | Exploration via Elliptical Episodic Bonuses | Mikael Henaff · Roberta Raileanu · Minqi Jiang · Tim Rocktäschel
Poster | | Constrained Update Projection Approach to Safe Policy Optimization | Long Yang · Jiaming Ji · Juntao Dai · Linrui Zhang · Binbin Zhou · Pengfei Li · Yaodong Yang · Gang Pan
Poster | Tue 9:00 | Near-Optimal Randomized Exploration for Tabular Markov Decision Processes | Zhihan Xiong · Ruoqi Shen · Qiwen Cui · Maryam Fazel · Simon Du
Poster | Thu 14:00 | Provably Efficient Model-Free Constrained RL with Linear Function Approximation | Arnob Ghosh · Xingyu Zhou · Ness Shroff
Poster | | Finite-Time Analysis of Adaptive Temporal Difference Learning with Deep Neural Networks | Tao Sun · Dongsheng Li · Bao Wang
Poster | Thu 9:00 | DOPE: Doubly Optimistic and Pessimistic Exploration for Safe Reinforcement Learning | Archana Bura · Aria HasanzadeZonuzy · Dileep Kalathil · Srinivas Shakkottai · Jean-Francois Chamberland
Poster | Tue 9:00 | Continuous MDP Homomorphisms and Homomorphic Policy Gradient | Sahand Rezaei-Shoshtari · Rosie Zhao · Prakash Panangaden · David Meger · Doina Precup
Poster | Wed 14:00 | Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback | Tiancheng Jin · Tal Lancewicki · Haipeng Luo · Yishay Mansour · Aviv Rosenberg
Poster | Wed 9:00 | Improved Regret Analysis for Variance-Adaptive Linear Bandits and Horizon-Free Linear Mixture MDPs | Yeoneung Kim · Insoon Yang · Kwang-Sung Jun
Poster | | Provable General Function Class Representation Learning in Multitask Bandits and MDP | Rui Lu · Andrew Zhao · Simon Du · Gao Huang
Workshop | | Online Policy Optimization for Robust MDP | Jing Dong · Jingwei Li · Baoxiang Wang · Jingzhao Zhang
Workshop | Fri 8:20 | Online Policy Optimization for Robust MDP | Jing Dong · Jingwei Li · Baoxiang Wang · Jingzhao Zhang