Toward Semantic History Compression for Reinforcement Learning
Fabian Paischer · Thomas Adler · Andreas Radler · Markus Hofmarcher · Sepp Hochreiter
Event URL: https://openreview.net/forum?id=97C6klf5shp

Agents interacting under partial observability require access to past observations via a memory mechanism in order to approximate the true state of the environment. Recent work suggests that leveraging language as an abstraction is beneficial for creating a representation of past events. History Compression via Language Models (HELM) leverages a pretrained Language Model (LM) to represent the past. It relies on a randomized attention mechanism to translate environment observations into token embeddings. In this work, we show that the representations resulting from this attention mechanism can collapse under certain conditions, rendering the agent blind to certain subtleties in the environment. We propose a two-part solution to this problem. First, we improve upon HELM by substituting the attention mechanism with a feature-wise centering-and-scaling operation. Second, we take a step toward semantic history compression by encoding the observations with a pretrained multimodal model such as CLIP, which further improves performance. With these improvements, our model is able to solve the challenging MiniGrid-Memory environment. Surprisingly, however, our experiments suggest that this is due not to the semantic enrichment of the representation presented to the LM, but only to the discriminative power provided by CLIP.
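The feature-wise centering-and-scaling operation mentioned above can be illustrated with a minimal sketch: standardize the observation embedding feature-wise, then shift and rescale it so its statistics match those of the LM's token-embedding matrix. This is only an assumed reading of the abstract's description (the exact formulation is in the paper); the function name and the epsilon constant are illustrative.

```python
import numpy as np

def center_and_scale(obs_emb, tok_embs, eps=1e-8):
    """Map an observation embedding into the distribution of the LM's
    token embeddings via feature-wise centering and scaling.

    obs_emb:  (d,) embedding of the current observation (e.g. from CLIP)
    tok_embs: (V, d) pretrained token-embedding matrix of the LM

    Note: this is a sketch of the operation described in the abstract,
    not the authors' reference implementation.
    """
    # Standardize the observation embedding (zero mean, unit variance).
    obs_std = (obs_emb - obs_emb.mean()) / (obs_emb.std() + eps)
    # Shift and scale feature-wise to match the token-embedding statistics,
    # so the LM receives inputs from a familiar distribution.
    mu = tok_embs.mean(axis=0)
    sigma = tok_embs.std(axis=0)
    return obs_std * sigma + mu
```

Unlike HELM's randomized attention over the token vocabulary, this deterministic operation cannot map distinct observations onto a collapsed representation, which is the failure mode the paper identifies.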

Author Information

Fabian Paischer (ELLIS Unit / University Linz)
Thomas Adler (ELLIS Unit / University Linz)
Andreas Radler (Johannes Kepler Universität Linz)
Markus Hofmarcher (ELLIS Unit / University Linz)
Sepp Hochreiter (ELLIS Unit / University Linz)