Through the telecom lens: Are all training samples important?
Abstract
The rise of AI in telecommunications, from optimizing Radio Access Networks to managing user experience, has sharply increased data volumes and training demands. Telecom data is often noisy, high-dimensional, and costly to store, process, and label. Despite AI’s critical role, standard workflows still assume that all training samples contribute equally, whereas next-generation systems require AI models that are accurate, efficient, and sustainable. This paper questions the equal-importance assumption by analyzing the role of individual samples in telecom model training and assessing whether the proposed method optimizes computing and energy use. We conduct sample-level gradient analysis across epochs to identify patterns of influence and redundancy in model learning. Based on this analysis, we propose a sample-importance framework that selectively prioritizes impactful data and reduces computational cost without compromising accuracy. Experiments on real-world telecom datasets show that our method preserves performance while reducing data and compute requirements, advancing the goals of sustainable AI in telecommunications.
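The sample-level gradient analysis mentioned above can be illustrated with a minimal sketch. This is not the paper's exact method; it assumes a simple logistic-regression loss and ranks samples by the L2 norm of their individual loss gradients, keeping only the most influential fraction. The function names (`per_sample_grad_norms`, `select_important`) and the `keep_frac` parameter are hypothetical, introduced here for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def per_sample_grad_norms(X, y, w):
    """L2 norm of each sample's loss gradient w.r.t. weights w
    (illustrative stand-in for the paper's sample-level analysis)."""
    p = sigmoid(X @ w)             # predicted probabilities, shape (n,)
    residual = p - y               # per-sample error term
    grads = residual[:, None] * X  # per-sample gradient rows, shape (n, d)
    return np.linalg.norm(grads, axis=1)

def select_important(X, y, w, keep_frac=0.5):
    """Indices of the keep_frac fraction of samples with the
    largest gradient norms (hypothetical selection rule)."""
    norms = per_sample_grad_norms(X, y, w)
    k = max(1, int(keep_frac * len(norms)))
    return np.argsort(norms)[::-1][:k]

# Synthetic data standing in for a telecom dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = (X[:, 0] > 0).astype(float)
w = np.zeros(5)

idx = select_important(X, y, w, keep_frac=0.3)
print(len(idx))  # 30 samples retained out of 100
```

In a full training loop, such a ranking would be recomputed across epochs so that redundant, low-influence samples can be dropped while high-influence samples continue to drive the updates.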