Timezone: »

Generalisation in Lifelong Reinforcement Learning through Logical Composition
Geraud Nangue Tasse · Steven James · Benjamin Rosman
Event URL: https://openreview.net/forum?id=kO-Cgmasm7 »

We leverage logical composition in reinforcement learning to create a framework that enables an agent to autonomously determine whether a new task can be immediately solved using its existing abilities, or whether a task-specific skill should be learned. In the latter case, the proposed algorithm also enables the agent to learn the new task faster by generating an estimate of the optimal policy. Importantly, we provide two main theoretical results: we give bounds on the performance of the transferred policy on a new task, and we give bounds on the necessary and sufficient number of tasks that need to be learned throughout an agent's lifetime to generalise over a distribution. We verify our approach in a series of experiments, where we perform transfer learning both after learning a set of base tasks, and after learning an arbitrary set of tasks. We also demonstrate that as a side effect of our transfer learning approach, an agent can produce an interpretable Boolean expression of its understanding of the current task. Finally, we demonstrate our approach in the full lifelong setting where an agent receives tasks from an unknown distribution and, starting from zero skills, is able to quickly generalise over the task distribution after learning only a few tasks---which are sub-logarithmic in the size of the task space.

Author Information

Geraud Nangue Tasse (University of the Witwatersrand)
Steven James (University of the Witwatersrand)
Benjamin Rosman (University of the Witwatersrand)

More from the Same Authors