Poster
Belief-Dependent Macro-Action Discovery in POMDPs using the Value of Information
Genevieve Flaspohler · Nicholas Roy · John Fisher III

Tue Dec 08 09:00 AM -- 11:00 AM (PST) @ Poster Session 1 #503

This work introduces macro-action discovery using the value of information (VoI) for robust and efficient planning in partially observable Markov decision processes (POMDPs). POMDPs are a powerful framework for planning under uncertainty. Previous approaches have used high-level macro-actions within POMDP policies to reduce planning complexity. However, macro-action design is often heuristic and rarely comes with performance guarantees. Here, we present a method for extracting belief-dependent, variable-length macro-actions directly from a low-level POMDP model. We construct macro-actions by chaining sequences of open-loop actions together when the task-specific VoI (the change in expected task performance caused by observations in the current planning iteration) is low. Importantly, we provide performance guarantees on the resulting VoI macro-action policies in the form of bounded regret relative to the optimal policy. In simulated tracking experiments, we achieve higher reward than both closed-loop and hand-coded macro-action baselines, selectively using VoI macro-actions to reduce planning complexity while maintaining near-optimal task performance.
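
The idea of chaining open-loop actions while VoI is low can be illustrated with a small sketch. This is not the authors' algorithm. It is a simplified, one-step reading of VoI under several assumptions that do not come from the abstract: a discrete POMDP with a single fixed action, a value function given as a set of alpha vectors, and hypothetical names (`predict`, `value`, `voi`, `macro_length`, `threshold`, `max_len`). Here VoI is taken to be the expected value after observing minus the value of the unobserved predicted belief.

```python
# Illustrative sketch only; all names and the one-step VoI definition are assumptions.

def predict(belief, T):
    """Open-loop belief update: push the belief through T[s][s_next]."""
    n = len(belief)
    return [sum(belief[s] * T[s][s2] for s in range(n)) for s2 in range(n)]


def value(belief, alphas):
    """Piecewise-linear value: the best alpha vector's dot product with the belief."""
    return max(sum(a * b for a, b in zip(alpha, belief)) for alpha in alphas)


def voi(belief, T, O, alphas):
    """One-step VoI: expected value with an observation minus value without one.

    O[s][o] is the probability of observation o in (next) state s.
    """
    pred = predict(belief, T)
    open_loop = value(pred, alphas)
    closed_loop = 0.0
    for o in range(len(O[0])):
        unnorm = [pred[s] * O[s][o] for s in range(len(pred))]
        p_o = sum(unnorm)
        if p_o > 0.0:
            posterior = [u / p_o for u in unnorm]
            closed_loop += p_o * value(posterior, alphas)
    return closed_loop - open_loop


def macro_length(belief, T, O, alphas, threshold, max_len):
    """Count how many open-loop steps are chained before VoI reaches the threshold."""
    length = 0
    b = belief
    while length < max_len and voi(b, T, O, alphas) < threshold:
        b = predict(b, T)  # act without using an observation
        length += 1
    return length


if __name__ == "__main__":
    T = [[1.0, 0.0], [0.0, 1.0]]            # static two-state world
    alphas = [[1.0, 0.0], [0.0, 1.0]]       # reward for guessing the true state
    belief = [0.5, 0.5]
    uninformative = [[0.5, 0.5], [0.5, 0.5]]
    perfect = [[1.0, 0.0], [0.0, 1.0]]
    print(macro_length(belief, T, uninformative, alphas, 0.1, 5))
    print(macro_length(belief, T, perfect, alphas, 0.1, 5))
```

In this toy setup, an uninformative sensor gives zero VoI, so open-loop steps are chained up to `max_len`, while a perfect sensor gives high VoI and the chain stops immediately. The method described in the abstract additionally handles multiple actions and comes with regret bounds, neither of which this sketch attempts.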

Author Information

Genevieve Flaspohler (Massachusetts Institute of Technology)
Nicholas Roy (MIT)
John Fisher III (MIT)
