Poster in Workshop: Deep Reinforcement Learning
Neighborhood Mixup Experience Replay: Local Convex Interpolation for Improved Sample Efficiency in Continuous Control Tasks
Ryan Sander · Wilko Schwarting · Tim Seyde · Igor Gilitschenski · Sertac Karaman · Daniela Rus
The human brain is remarkably sample-efficient, capable of learning complex behaviors by meaningfully combining previous experiences to simulate novel ones, even when few experiences are available. To improve sample efficiency in continuous control tasks, we take inspiration from this learning phenomenon. We propose Neighborhood Mixup Experience Replay (NMER), a modular replay buffer that interpolates transitions with their closest neighbors in normalized state-action space. NMER preserves a locally linear approximation of the transition manifold by only interpolating transitions with similar state-action features. Under NMER, a given transition’s set of state-action neighbors is dynamic and episode-agnostic, in turn encouraging greater policy generalizability via cross-episode interpolation. We combine our approach with recent off-policy reinforcement learning algorithms and evaluate on several continuous control environments. We observe that NMER improves sample efficiency over other state-of-the-art replay buffers, enabling agents to effectively recombine previous experience and learn from limited data.
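The mechanism the abstract describes, interpolating a sampled transition with a nearby transition in normalized state-action space via mixup, can be sketched roughly as follows. This is a minimal illustrative sketch, not the authors' implementation; the class name `NMERBuffer` and parameters such as `k_neighbors` and the Beta mixup coefficient `alpha` are assumptions made here for illustration.

```python
# Minimal sketch of a neighborhood-mixup replay buffer (illustrative only).
# Transitions are stored in flat NumPy arrays; on sampling, each anchor
# transition is convexly interpolated with a random nearest neighbor found
# in z-scored (normalized) state-action space.
import numpy as np


class NMERBuffer:
    def __init__(self, capacity, state_dim, action_dim, k_neighbors=10, alpha=1.0):
        self.capacity, self.k, self.alpha = capacity, k_neighbors, alpha
        self.size, self.ptr = 0, 0
        self.states = np.zeros((capacity, state_dim))
        self.actions = np.zeros((capacity, action_dim))
        self.rewards = np.zeros((capacity, 1))
        self.next_states = np.zeros((capacity, state_dim))
        self.dones = np.zeros((capacity, 1))

    def add(self, s, a, r, s2, done):
        i = self.ptr
        self.states[i], self.actions[i] = s, a
        self.rewards[i], self.next_states[i], self.dones[i] = r, s2, done
        self.ptr = (self.ptr + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def _normalized_keys(self):
        # Z-score state-action features so no single dimension dominates
        # the nearest-neighbor distance.
        sa = np.concatenate(
            [self.states[:self.size], self.actions[:self.size]], axis=1)
        return (sa - sa.mean(axis=0)) / (sa.std(axis=0) + 1e-8)

    def sample(self, batch_size, rng=np.random):
        keys = self._normalized_keys()
        anchors = rng.randint(0, self.size, size=batch_size)
        batch = []
        for i in anchors:
            # Nearest neighbors of the anchor in normalized state-action space,
            # excluding the anchor itself.
            dists = np.linalg.norm(keys - keys[i], axis=1)
            neighbors = np.argsort(dists)[1:self.k + 1]
            j = rng.choice(neighbors)
            # Convex (mixup) interpolation between anchor and neighbor.
            lam = rng.beta(self.alpha, self.alpha)

            def mix(x):
                return lam * x[i] + (1.0 - lam) * x[j]

            batch.append((mix(self.states), mix(self.actions), mix(self.rewards),
                          mix(self.next_states), mix(self.dones)))
        s, a, r, s2, d = map(np.stack, zip(*batch))
        return s, a, r, s2, d
```

In this sketch the interpolated transitions are produced only at sampling time, so the buffer remains a drop-in replacement for a standard experience replay in off-policy algorithms such as SAC or TD3; because neighbors are recomputed over the whole buffer rather than within episodes, the mixed samples can combine experience across episodes, which is the cross-episode interpolation the abstract refers to.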