
Workshop: Causal Representation Learning

Exploiting Causal Representations in Reinforcement Learning: A Posterior Sampling Approach

Mirco Mutti · Riccardo De Santi · Marcello Restelli · Alexander Marx · Giorgia Ramponi

Keywords: [ Causal Reinforcement Learning ] [ Posterior Sampling ]


Posterior sampling lets reinforcement learning exploit prior knowledge of the environment's transition dynamics to improve sample efficiency. The prior is typically specified as a class of parametric distributions, a task that can be cumbersome in practice and often results in the choice of uninformative priors. In this work, we instead study how to exploit causal representations to build priors that are often more natural to design. Specifically, we propose a novel hierarchical posterior sampling approach, called C-PSRL, in which the prior is given as a (partial) causal graph over the environment's causal variables, e.g., a list of known causal dependencies between biometric features in a medical treatment study. C-PSRL simultaneously learns a graph consistent with the true causal graph at the higher level and the parameters of the resulting factored dynamics at the lower level. We analyze the Bayesian regret of this procedure, explicitly connecting the regret rate with the degree of prior causal knowledge, and we show that regret minimization leads to a weak notion of causal discovery.
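The two-level scheme described in the abstract can be illustrated with a toy sketch: each episode, a higher level samples a DAG consistent with the known partial causal graph, and a lower level samples the parameters of the induced factored dynamics from per-factor Dirichlet posteriors. This is an illustrative sketch under simplifying assumptions (binary variables, a uniform prior over unknown edges, a hypothetical ground-truth environment named `env_step`), not the authors' implementation of C-PSRL.

```python
import numpy as np

rng = np.random.default_rng(0)

n_vars = 3                                   # binary causal variables (toy example)
known_edges = {(0, 1)}                       # partial prior: variable 0 causes variable 1
candidate_edges = [(0, 1), (0, 2), (1, 2)]   # edges any sampled DAG may contain

def sample_graph():
    """Higher level: sample a DAG consistent with the known partial graph."""
    edges = set(known_edges)
    for e in candidate_edges:
        if e not in edges and rng.random() < 0.5:  # uniform prior over unknown edges
            edges.add(e)
    return frozenset(edges)

def parents(graph, j):
    return tuple(sorted(i for (i, k) in graph if k == j))

def sample_dynamics(graph, counts):
    """Lower level: sample each factor's transition table from a Dirichlet posterior."""
    cpts = {}
    for j in range(n_vars):
        pa = parents(graph, j)
        # One Dirichlet(1, 1) row per configuration of the parents.
        alpha = counts.setdefault((j, pa), np.ones((2 ** len(pa), 2)))
        cpts[j] = np.array([rng.dirichlet(a) for a in alpha])
    return cpts

def env_step(state):
    """Hypothetical ground-truth factored dynamics used only to drive the demo."""
    true_pa = {0: (), 1: (0,), 2: (0, 1)}
    return tuple(
        int(rng.random() < (0.9 if all(state[i] for i in true_pa[j]) else 0.1))
        for j in range(n_vars)
    )

# Posterior-sampling loop: resample (graph, dynamics) each episode, act, update counts.
counts, state = {}, (0, 0, 0)
for episode in range(20):
    graph = sample_graph()
    cpts = sample_dynamics(graph, counts)      # the sampled model would guide planning here
    for _ in range(10):
        nxt = env_step(state)
        for j in range(n_vars):
            pa = parents(graph, j)
            ctx = sum(state[i] << b for b, i in enumerate(pa))
            counts[(j, pa)][ctx, nxt[j]] += 1  # conjugate posterior update for factor j
        state = nxt
```

Note how the known edge is forced into every sampled graph, so the prior causal knowledge constrains the hypothesis space at the higher level, while the lower level only has to learn the (smaller) factored transition tables, which is the intuition behind the improved regret rate.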
