Poster
Online Regret Bounds for Undiscounted Continuous Reinforcement Learning
Ronald Ortner · Daniil Ryabko
Thu Dec 06 02:00 PM -- 12:00 AM (PST) @ Harrah’s Special Events Center 2nd Floor
We derive sublinear regret bounds for undiscounted reinforcement learning in continuous state space. The proposed algorithm combines state aggregation with the use of upper confidence bounds for implementing optimism in the face of uncertainty. Besides the existence of an optimal policy that satisfies the Poisson equation, the only assumptions made are Hölder continuity of rewards and transition probabilities.
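As a rough illustration of how state aggregation and upper confidence bounds can fit together, the following minimal Python sketch discretizes a one-dimensional state space and computes an optimistic reward estimate for an aggregated state. It is not the authors' algorithm. The state space [0, 1], rewards in [0, 1], the function names, the bonus constant, and the Hölder parameters `L` and `alpha` are all assumptions made for the example.

```python
import math


def interval_index(state, n_intervals):
    """Map a state in [0, 1] to the index of its aggregation interval.

    Assumption: the state space is [0, 1], split into equal-width intervals.
    """
    if not 0.0 <= state <= 1.0:
        raise ValueError("state must lie in [0, 1]")
    # The right endpoint 1.0 is assigned to the last interval.
    return min(int(state * n_intervals), n_intervals - 1)


def optimistic_reward(mean_reward, visits, t, n_intervals, L=1.0, alpha=1.0):
    """Empirical mean plus a sampling bonus and an aggregation-error term.

    Hypothetical constants: the bonus sqrt(2 log t / N) is a generic
    UCB-style width, and L * n^(-alpha) bounds how much a Hoelder-continuous
    reward can vary inside an interval of width 1 / n.
    """
    sampling_bonus = math.sqrt(2.0 * math.log(max(t, 2)) / max(visits, 1))
    aggregation_error = L * n_intervals ** (-alpha)
    # Rewards are assumed to lie in [0, 1], so the estimate is capped at 1.
    return min(1.0, mean_reward + sampling_bonus + aggregation_error)
```

In this sketch the optimism shrinks as an aggregated state is visited more often, while the aggregation term shrinks only when a finer discretization is used, which hints at the trade-off in choosing the number of intervals.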
Author Information
Ronald Ortner (Montanuniversitaet Leoben)
Daniil Ryabko (INRIA)
More from the Same Authors
- 2019 Poster: Regret Bounds for Learning State Representations in Reinforcement Learning »
Ronald Ortner · Matteo Pirotta · Alessandro Lazaric · Ronan Fruit · Odalric-Ambrym Maillard
- 2017 Poster: Independence clustering (without a matrix) »
Daniil Ryabko
- 2014 Workshop: From Bad Models to Good Policies (Sequential Decision Making under Uncertainty) »
Odalric-Ambrym Maillard · Timothy A Mann · Shie Mannor · Jeremie Mary · Laurent Orseau · Thomas Dietterich · Ronald Ortner · Peter Grünwald · Joelle Pineau · Raphael Fonteneau · Georgios Theocharous · Esteban D Arcaute · Christos Dimitrakakis · Nan Jiang · Doina Precup · Pierre-Luc Bacon · Marek Petrik · Aviv Tamar
- 2012 Poster: Reducing statistical time-series problems to binary classification »
Daniil Ryabko · Jeremie Mary
- 2012 Poster: Locating Changes in Highly Dependent Data with Unknown Number of Change Points »
Azadeh Khaleghi · Daniil Ryabko
- 2012 Spotlight: Locating Changes in Highly Dependent Data with Unknown Number of Change Points »
Azadeh Khaleghi · Daniil Ryabko
- 2011 Poster: PAC-Bayesian Analysis of Contextual Bandits »
Yevgeny Seldin · Peter Auer · Francois Laviolette · John Shawe-Taylor · Ronald Ortner
- 2011 Poster: Selecting the State-Representation in Reinforcement Learning »
Odalric-Ambrym Maillard · Remi Munos · Daniil Ryabko