NeurIPS 2020: Meta-trained agents implement Bayes-optimal agents



Meta-trained agents implement Bayes-optimal agents

Vlad Mikulik, Grégoire Delétang, Tom McGrath, Tim Genewein, Miljan Martic, Shane Legg, Pedro Ortega

Spotlight presentation: Orals & Spotlights Track 16: Continual/Meta/Misc Learning
on Wed, Dec 9th, 2020 @ 15:00 – 15:10 GMT

Poster Session 4
on Wed, Dec 9th, 2020 @ 17:00 – 19:00 GMT
GatherTown: Applications (Town B2, Spot C0)


Abstract: Memory-based meta-learning is a powerful technique to build agents that adapt fast to any task within a target distribution. A previous theoretical study has argued that this remarkable performance is because the meta-training protocol incentivises agents to behave Bayes-optimally. We empirically investigate this claim on a number of prediction and bandit tasks. Inspired by ideas from theoretical computer science, we show that meta-learned and Bayes-optimal agents not only behave alike, but they even share a similar computational structure, in the sense that one agent system can approximately simulate the other. Furthermore, we show that Bayes-optimal agents are fixed points of the meta-learning dynamics. Our results suggest that memory-based meta-learning is a general technique for numerically approximating Bayes-optimal agents; that is, even for task distributions for which we currently don't possess tractable models.
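
To make "Bayes-optimal" concrete for a prediction task, the following is a minimal sketch, assuming a coin-flip task distribution in which each task's bias is drawn from a Beta prior. The specific task, the uniform Beta(1, 1) prior, and the function names `sample_task` and `bayes_optimal_predictions` are illustrative assumptions and are not taken from the abstract. Under these assumptions, the predictor that minimises expected log loss is the posterior predictive, which is the kind of reference agent a meta-trained predictor could be compared against.

```python
import random


def sample_task(rng, length, alpha=1.0, beta=1.0):
    """Draw one task (a coin bias from Beta(alpha, beta)) and a sequence of flips from it.

    Hypothetical task distribution, chosen only for illustration.
    """
    bias = rng.betavariate(alpha, beta)
    return [1 if rng.random() < bias else 0 for _ in range(length)]


def bayes_optimal_predictions(observations, alpha=1.0, beta=1.0):
    """Return P(next flip = 1) before each observation is revealed.

    With a Beta(alpha, beta) prior, the posterior after seeing h ones and
    t zeros is Beta(alpha + h, beta + t), so the posterior predictive is
    (alpha + h) / (alpha + beta + h + t).
    """
    ones, zeros = alpha, beta  # pseudo-counts encode the prior
    predictions = []
    for x in observations:
        predictions.append(ones / (ones + zeros))
        if x == 1:
            ones += 1
        else:
            zeros += 1
    return predictions


if __name__ == "__main__":
    rng = random.Random(0)
    flips = sample_task(rng, length=10)
    for flip, p in zip(flips, bayes_optimal_predictions(flips)):
        print(f"predicted P(1) = {p:.3f}, observed {flip}")
```

The predictor depends on the history only through two counts. In this toy setting, those counts are the compact internal state that a memory-based learner would need to track in order to reproduce the same predictions.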
