Dynamics-Augmented Decision Transformer for Offline Dynamics Generalization
Abstract
Recent progress in offline reinforcement learning (RL) has shown that it is often possible to train strong agents without potentially unsafe or impractical online interaction. However, in real-world settings, agents may encounter unseen environments whose dynamics differ from those seen during training, so the ability to generalize is essential. This work presents the Dynamics-Augmented Decision Transformer (DADT), a simple yet effective method for training generalizable agents from offline datasets: on top of a return-conditioned policy built on the transformer architecture, we improve generalization via representation learning based on next-state prediction. Our experimental results demonstrate that DADT outperforms prior state-of-the-art methods for offline dynamics generalization. Intriguingly, DADT even outperforms fine-tuned baselines without any fine-tuning.
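To make the high-level idea concrete, below is a minimal PyTorch sketch of a return-conditioned transformer policy with an auxiliary next-state-prediction head. All names (e.g., DADTSketch), dimensions, the token layout, and the auxiliary loss weight are illustrative assumptions and do not reflect the paper's actual implementation.

```python
# A minimal, illustrative sketch of the DADT idea: a return-conditioned
# transformer policy trained jointly with an auxiliary next-state-
# prediction objective. Module names, dimensions, and the loss weighting
# below are assumptions for illustration only.
import torch
import torch.nn as nn


class DADTSketch(nn.Module):
    def __init__(self, state_dim: int, action_dim: int, embed_dim: int = 128,
                 n_layers: int = 3, n_heads: int = 4, max_len: int = 20):
        super().__init__()
        # Separate embeddings for return-to-go, state, and action tokens.
        self.embed_rtg = nn.Linear(1, embed_dim)
        self.embed_state = nn.Linear(state_dim, embed_dim)
        self.embed_action = nn.Linear(action_dim, embed_dim)
        self.pos_emb = nn.Parameter(torch.zeros(1, 3 * max_len, embed_dim))
        layer = nn.TransformerEncoderLayer(embed_dim, n_heads,
                                           batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, n_layers)
        # Policy head (action prediction) and auxiliary dynamics head
        # (next-state prediction) share the same representation.
        self.action_head = nn.Linear(embed_dim, action_dim)
        self.dynamics_head = nn.Linear(embed_dim, state_dim)

    def forward(self, rtg, states, actions):
        B, T = states.shape[:2]
        # Interleave tokens as (R_1, s_1, a_1, R_2, s_2, a_2, ...).
        tokens = torch.stack(
            [self.embed_rtg(rtg), self.embed_state(states),
             self.embed_action(actions)], dim=2
        ).reshape(B, 3 * T, -1) + self.pos_emb[:, :3 * T]
        mask = nn.Transformer.generate_square_subsequent_mask(3 * T)
        h = self.transformer(tokens, mask=mask)
        h = h.reshape(B, T, 3, -1)
        # Predict a_t from the state token; predict s_{t+1} from the
        # action token (the auxiliary representation-learning signal).
        return self.action_head(h[:, :, 1]), self.dynamics_head(h[:, :, 2])


# Toy training step on random data; the 0.1 auxiliary weight is hypothetical.
model = DADTSketch(state_dim=17, action_dim=6)
rtg = torch.randn(8, 10, 1)
states, actions = torch.randn(8, 10, 17), torch.randn(8, 10, 6)
next_states = torch.randn(8, 10, 17)
pred_a, pred_s = model(rtg, states, actions)
loss = nn.functional.mse_loss(pred_a, actions) \
     + 0.1 * nn.functional.mse_loss(pred_s, next_states)
loss.backward()
```

The design intuition is that forcing the shared representation to predict next states encourages it to encode environment dynamics, which plausibly helps the policy transfer to unseen dynamics; the exact architecture and training details should be taken from the paper itself.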