Affinity Workshop: Women in Machine Learning

Dual Channel Training of Large Action Spaces in Reinforcement Learning

Pranavi Pathakota · Hardik Meisheri · Harshad Khadilkar


The ability to learn robust policies while generalizing over large action spaces is an open challenge for intelligent systems, especially in noisy environments that face the curse of dimensionality. In this paper, we present a novel framework to efficiently learn action embeddings that simultaneously allow us to reconstruct the original action as well as to predict the expected future state. We use an encoder-decoder architecture for action embeddings with a dual channel loss that balances between action reconstruction and state prediction accuracy. The trained decoder is then utilized in conjunction with a standard reinforcement learning algorithm that produces actions in the embedding space. Our architecture is able to solve a 2D maze environment with up to 2^12 discrete noisy actions. Empirical results show that the model results in cleaner action embeddings, and the improved representations help learn better policies with earlier convergence.

Chat is not available.