Poster in Affinity Workshop: Black in AI
Imitation from Observation With Bootstrapped Contrastive Learning
Medric Sonwa · Johanna Hansen · Eugene Belilovsky
Keywords: [ Computer Vision ]
Imitation from observation is a paradigm in which agents are trained using visual observations of expert demonstrations, without direct access to the expert's actions. One of the most common approaches to this problem is to learn a reward function from the demonstrations, but doing so remains a significant challenge. We approach this problem by learning a representation of agent behavior in a latent space from demonstration videos. Our approach exploits recent contrastive learning algorithms for images and video, and uses a bootstrapping procedure to progressively train a trajectory encoder as the agent's policy evolves. This encoder is then used to compute the rewards provided to a standard Reinforcement Learning (RL) algorithm. Our method uses only a limited number of videos produced by an expert, and we do not require access to the expert policy. Our experiments show promising results on a set of continuous control tasks and demonstrate that learning a behavior encoder from videos allows for building an effective reward function for the agent.
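As a rough illustration of how a learned trajectory encoder could provide rewards to an RL algorithm, the sketch below embeds a window of the agent's recent observations and compares it to precomputed embeddings of the expert demonstration windows, using cosine similarity as the reward. The `TrajectoryEncoder` architecture, the window-based pooling, and the max-similarity reward are illustrative assumptions for this sketch, not the authors' actual implementation.

```python
# Hypothetical sketch: rewarding an RL agent with similarity between a
# learned trajectory embedding and embeddings of expert demonstration
# windows. Names, shapes, and the architecture are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TrajectoryEncoder(nn.Module):
    """Encodes a short window of flattened image observations into a unit-norm latent vector."""

    def __init__(self, obs_dim=64 * 64 * 3, latent_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 512), nn.ReLU(),
            nn.Linear(512, latent_dim),
        )

    def forward(self, obs_window):
        # obs_window: (batch, window, obs_dim)
        z = self.net(obs_window)                    # (batch, window, latent_dim)
        return F.normalize(z.mean(dim=1), dim=-1)   # pool over time, L2-normalize


def compute_reward(encoder, agent_window, expert_embeddings):
    """Reward = max cosine similarity between the agent's trajectory embedding
    and the (already normalized) embeddings of the expert demonstration windows."""
    with torch.no_grad():
        z_agent = encoder(agent_window.unsqueeze(0))    # (1, latent_dim)
        sims = z_agent @ expert_embeddings.T            # (1, n_expert_windows)
        return sims.max().item()
```

In this sketch, the resulting scalar would simply replace the environment reward in a standard RL loop; under the bootstrapping scheme described above, the encoder itself would also be periodically retrained as the agent's policy and trajectory distribution change.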