
Task-Independent Causal State Abstraction
Zizhao Wang · Xuesu Xiao · Yuke Zhu · Peter Stone

Learning dynamics models accurately and learning policies sample-efficiently are two important challenges for Model-Based Reinforcement Learning (MBRL). Regarding dynamics accuracy, in contrast to the sparse dynamics exhibited in many real-world environments, most MBRL methods learn a dense dynamics model that is vulnerable to spurious correlations and therefore generalizes poorly to unseen states. Meanwhile, existing state abstractions can improve sample efficiency, but their dependence on specific reward functions constrains their application to a limited set of tasks. In this paper, we introduce an alternative state abstraction called Task-Independent Causal State Abstraction (TICSA). Exploiting the sparsity exhibited in the real world, the proposed method first learns a causal dynamics model that generalizes to unexplored states. A state abstraction can then be derived from the learned dynamics, which not only improves sample efficiency but also applies to many tasks. Using a simulated manipulation environment and two different tasks, we observe that both the dynamics model and the policies learned by TICSA generalize well to unseen states and that learning with TICSA also improves sample efficiency.
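To make the idea concrete, the following is a minimal illustrative sketch (not the paper's implementation) of how a state abstraction might be read off a learned causal dynamics graph. It assumes a binary causal graph over state variables and the action, and keeps only the variables the action can influence through chains of causal edges; all variable names, graph entries, and the specific closure rule here are hypothetical.

```python
import numpy as np

# Hypothetical learned causal graph over 5 state variables plus 1 action:
# graph[i, j] = 1 means variable j (or the action, at column num_vars)
# causally affects variable i at the next timestep. Entries are illustrative.
num_vars = 5
graph = np.zeros((num_vars, num_vars + 1), dtype=int)
graph[0, 0] = 1            # x0' depends on x0
graph[0, num_vars] = 1     # x0' depends on the action
graph[1, 0] = 1            # x1' depends on x0 ...
graph[1, 1] = 1            # ... and on x1
graph[2, 2] = 1            # x2', x3', x4' each depend only on themselves,
graph[3, 3] = 1            # so they are isolated from the agent's control
graph[4, 4] = 1

def action_relevant_vars(graph, num_vars):
    """Return indices of state variables the action can influence,
    directly or through chains of causal edges (a reachability closure)."""
    influenced = {i for i in range(num_vars) if graph[i, num_vars]}
    changed = True
    while changed:
        changed = False
        for i in range(num_vars):
            if i not in influenced and any(graph[i, j] for j in influenced):
                influenced.add(i)
                changed = True
    return sorted(influenced)

abstraction = action_relevant_vars(graph, num_vars)
print(abstraction)  # indices kept by this illustrative abstraction
```

Because the kept set depends only on the learned dynamics graph and not on any reward function, the same abstraction can be reused across tasks, which is the sense in which the abstraction is task-independent.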

Author Information

Zizhao Wang (The University of Texas at Austin)
Xuesu Xiao (The University of Texas at Austin)
Yuke Zhu (The University of Texas at Austin)
Peter Stone (The University of Texas at Austin, Sony AI)
