Reinforcement Learning under Partial Observability

Workshop

Joni Pajarinen · Chris Amato · Pascal Poupart · David Hsu

Room 517 C

Sat 8 Dec, 5 a.m. PST

Reinforcement learning (RL) has succeeded in many challenging tasks such as Atari, Go, and Chess, and even in high-dimensional continuous domains such as robotics. The most impressive successes are in tasks where the agent fully observes the task features. However, in real-world problems, the agent can usually rely only on partial observations. In real-time games the agent makes only local observations; in robotics the agent has to cope with noisy sensors, occlusions, and unknown dynamics. Even more fundamentally, any agent without a full a priori world model or without full access to the system state has to make decisions based on partial knowledge about the environment and its dynamics.

Reinforcement learning under partial observability has been tackled in the operations research, control, planning, and machine learning communities. One of the goals of the workshop is to bring researchers from different backgrounds together. Moreover, the workshop aims to highlight future applications. In addition to robotics, where partial observability is a well-known challenge, many diverse applications such as wireless networking, human-robot interaction, and autonomous driving require taking partial observability into account.

Partial observability introduces unique challenges: the agent has to remember the past but also connect the present with potential futures, which requires memory, exploration, and value propagation techniques that can handle partial observability. Current model-based methods can handle discrete values and take long-term information gathering into account. Model-free methods can handle high-dimensional continuous problems, but they often either assume that the state space has been designed for the problem at hand so that it contains sufficient information for optimal decision making, or simply add memory to the policy without taking partial observability explicitly into account.

In this workshop, we want to go further and ask, among others, the following questions:
* How can we extend deep RL methods to robustly solve partially observable problems?
* Can we learn concise abstractions of history that are sufficient for high-quality decision-making?
* There have been several successes in decision making under partial observability despite the inherent challenges. Can we characterize problems where computing good policies is feasible?
* Since decision making is hard under partial observability, do we want to use more complex models and solve them approximately, or use simple (inaccurate) models and solve them exactly? Or not use models at all?
* How can we use control theory together with reinforcement learning to advance decision making under partial observability?
* Can we combine the strengths of model-based and model-free methods under partial observability?
* Can recent methodological improvements in general RL already tackle some partially observable applications that were previously out of reach?
* How do we scale up reinforcement learning in multi-agent systems with partial observability?
* Do hierarchical models / temporal abstraction improve RL efficiency under partial observability?
