Meta Dynamic Programming
Pierluca D'Oro
Abstract
To accelerate the pace at which they acquire new information, reinforcement learning algorithms can select which data to use first for training. In this paper, we outline a general methodology for performing this selection, pointing toward a generation of agents that reason explicitly about their current and future learning state when selecting their training data. In the context of prioritization methods for asynchronous dynamic programming, we propose a meta-level technique for state selection. We show that the method, called meta dynamic programming, together with its approximations, can provide promising performance improvements while being grounded in a theoretically sound metacognitive formalization.
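For readers unfamiliar with the setting, the following is a minimal sketch of a standard prioritization baseline for asynchronous dynamic programming: at each step, back up the single state with the largest absolute Bellman error. It is not the meta dynamic programming method proposed in the paper, which is not specified in this abstract. The function name, the array layout of `P` and `R`, and the three-state chain MDP are all assumptions made for illustration.

```python
import numpy as np


def prioritized_async_value_iteration(P, R, gamma=0.9, tol=1e-8, max_updates=100_000):
    """Asynchronous value iteration with greedy Bellman-error prioritization.

    Assumed layout (illustrative, not from the paper):
      P[s, a, s'] = transition probability, shape (S, A, S)
      R[s, a]     = expected immediate reward, shape (S, A)
    """
    n_states = P.shape[0]
    V = np.zeros(n_states)
    n_updates = 0
    while n_updates < max_updates:
        # Bellman optimality target for every state under the current V.
        targets = np.max(R + gamma * (P @ V), axis=1)
        # Priority of a state = magnitude of its Bellman error.
        errors = np.abs(targets - V)
        s = int(np.argmax(errors))
        if errors[s] < tol:
            break  # every state is (nearly) consistent with its backup
        V[s] = targets[s]  # back up only the selected state
        n_updates += 1
    return V, n_updates


# Hypothetical 3-state, 2-action chain MDP, used only for illustration.
P = np.zeros((3, 2, 3))
P[0, 0, 0] = 1.0  # state 0, action 0: stay in state 0
P[0, 1, 1] = 1.0  # state 0, action 1: move to state 1
P[1, 0, 0] = 1.0  # state 1, action 0: move back to state 0
P[1, 1, 2] = 1.0  # state 1, action 1: move to state 2
P[2, :, 2] = 1.0  # state 2 is absorbing under both actions
R = np.zeros((3, 2))
R[1, 1] = 1.0     # the only reward: taking action 1 in state 1

V, n_updates = prioritized_async_value_iteration(P, R, gamma=0.9)
print(V, n_updates)
```

In this toy problem the prioritized order reaches the fixed point after backing up state 1 and then state 0, whereas a fixed sweep order would also spend backups on states whose values cannot yet change. The sketch recomputes every Bellman error at each step, which is affordable only for small tabular problems; the selection rule itself is a simple greedy heuristic, while a meta-level approach of the kind described in the abstract would choose states by reasoning about the effect of an update on future learning.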