Skip to yearly menu bar Skip to main content

Workshop: Foundation Models for Decision Making

Natural Language-based State Representation in Deep Reinforcement Learning

Masudur Rahman · Yexiang Xue

[ ] [ Project Page ]
presentation: Foundation Models for Decision Making
Fri 15 Dec 6:15 a.m. PST — 3:30 p.m. PST


This study investigates the potential of using natural language descriptions as an alternative to direct image-based observations for learning policies in reinforcement learning. Due to the inherent challenges in managing image-based observations, which include abundant information and irrelevant features, we propose a method that compresses images into a natural language form for state representation. This approach allows better interpretability and leverages the processing capabilities of large language models (LLMs). We conducted several experiments involving tasks that required image-based observation. The results demonstrated that policies trained using natural language descriptions of images yield better generalization than those trained directly from images, emphasizing the potential of this approach in practical settings.

Chat is not available.