ERGO: Entropy-guided Resetting for Generation Optimization in Multi-turn Language Models
Abstract
Interactive AI systems face critical reliability challenges as conversation length increases: Large Language Models (LLMs) exhibit significant performance degradation when deployed in extended multi-turn settings. This degradation, which manifests as reduced accuracy, lower confidence, and a 112\% increase in response variability (i.e., unreliability), represents a fundamental robustness failure in interactive machine learning systems. We introduce ERGO (Entropy-guided Resetting for Generation Optimization), a principled approach to maintaining system reliability and performance in interactive environments that monitors internal uncertainty signals and triggers automated context consolidation when degradation is detected. ERGO uses Shannon entropy over next-token probability distributions as a real-time indicator of system robustness, automatically restructuring the interaction history when uncertainty spikes signal potential failure modes. Evaluated across multiple LLMs on interactive task scenarios, ERGO improves average performance by 56.6\% over degraded multi-turn baselines, fully recovers the 15\% drop in peak performance, and reduces response variability by 35.3\%. Our results demonstrate that entropy-based uncertainty monitoring provides an effective framework for building robust interactive ML systems that maintain consistent performance despite the inherent unreliability of accumulated, noisy conversational context.
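As a rough illustration of the monitoring loop described in the abstract (a minimal sketch, not the paper's exact implementation), the snippet below computes Shannon entropy over a model's next-token logits and flags a context reset when the latest entropy spikes relative to a recent rolling average. The window size and spike factor are assumed, illustrative values, not parameters reported by the paper.

```python
import numpy as np

def next_token_entropy(logits: np.ndarray) -> float:
    """Shannon entropy (in nats) of the next-token distribution given raw logits."""
    logits = logits - logits.max()                  # shift for numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()   # softmax
    return float(-(probs * np.log(probs + 1e-12)).sum())

def should_reset(entropy_history: list[float], window: int = 5,
                 spike_factor: float = 1.5) -> bool:
    """Flag a reset when the newest entropy exceeds the recent rolling mean by a factor.

    `window` and `spike_factor` are hypothetical defaults for illustration only.
    """
    if len(entropy_history) <= window:
        return False
    recent = entropy_history[-(window + 1):-1]
    return entropy_history[-1] > spike_factor * (sum(recent) / len(recent))
```

In this sketch, a `True` return would trigger whatever context-consolidation step the system uses (e.g., summarizing and restarting the conversation history); how that consolidation is performed is described in the paper itself, not here.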