Exploration in Model-based Reinforcement Learning by Empirically Estimating Learning Progress
Manuel Lopes · Tobias Lang · Marc Toussaint · Pierre-Yves Oudeyer

Thu Dec 06 02:00 PM -- 12:00 AM (PST) @ Harrah’s Special Events Center 2nd Floor

Formal exploration approaches in model-based reinforcement learning estimate the accuracy of the currently learned model without consideration of the empirical prediction error. For example, PAC-MDP approaches such as Rmax base their model certainty on the amount of collected data, while Bayesian approaches assume a prior over the transition dynamics. We propose extensions to such approaches which drive exploration solely based on empirical estimates of the learner's accuracy and learning progress. We provide a "sanity check" theoretical analysis, discussing the behavior of our extensions in the standard stationary finite state-action case. We then provide experimental studies demonstrating the robustness of these exploration measures in cases of non-stationary environments or where original approaches are misled by wrong domain assumptions.
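To make the contrast in the abstract concrete, the following is a minimal illustrative sketch, not the authors' algorithm. It compares a count-based "known" test in the spirit of Rmax with a hypothetical learning-progress measure computed from empirical prediction loss. The function names, the Laplace-smoothed categorical model, the window size, and the thresholds are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def rmax_known(visit_count, m=10):
    """Count-based test in the spirit of Rmax: a state-action pair counts
    as known once it has been visited at least m times (m is assumed)."""
    return visit_count >= m


def mean_log_loss(train, test, n_states, alpha=1.0):
    """Average negative log-likelihood of the `test` next-states under a
    Laplace-smoothed categorical model fitted to `train`."""
    counts = Counter(train)
    total = len(train) + alpha * n_states
    return -sum(math.log((counts[s] + alpha) / total) for s in test) / len(test)


def learning_progress(next_states, n_states, window=5):
    """Hypothetical progress measure for one state-action pair.

    The last `window` observed next-states are held out. An older model
    (trained without the previous `window` samples) and a newer model
    (trained with them) are both scored on the held-out data; progress is
    the resulting drop in loss. Returns infinity when data is too scarce.
    """
    if len(next_states) < 2 * window:
        return math.inf
    held_out = next_states[-window:]
    older = next_states[:-2 * window]
    newer = next_states[:-window]
    return (mean_log_loss(older, held_out, n_states)
            - mean_log_loss(newer, held_out, n_states))


# Stationary phase: the pair always leads to state 0.
stationary = [0] * 40
# Non-stationary phase: the dynamics switch and the pair now leads to state 1.
shifted = stationary + [1] * 10

print(rmax_known(len(shifted)))           # the count-based test ignores the switch
print(learning_progress(stationary, 3))   # small: little left to learn
print(learning_progress(shifted, 3))      # large: the model is improving again
```

In this toy setting the visit count keeps growing after the dynamics change, so the count-based test still reports the pair as known, whereas the empirical progress estimate rises again and would flag the pair as worth revisiting.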

Author Information

Manuel Lopes (INRIA)
Tobias Lang
Marc Toussaint (TU Berlin)
Pierre-Yves Oudeyer (INRIA)

More from the Same Authors