Position: There Is No Ground Truth -- Rethinking Evaluation in AI-Driven Channel Prediction
Abstract
Machine learning (ML) has rapidly gained traction for wireless channel state information (CSI) prediction, promising improved reliability and reduced overhead in 5G/6G systems. From autoencoder-based CSI compression to recent adaptations of large language models, a plethora of techniques report impressive accuracy in forecasting channel dynamics. This position paper argues, however, that many of these results rest on flawed evaluation practices. In particular, current works often assume an idealized “ground truth” provided by synthetic channel models and thereby overlook three key issues: (1) training–test leakage when the same generative simulator underpins both training and evaluation; (2) reliance on synthetic datasets without field validation; and (3) conflation of memorization with true generalization. The consequence is inflated performance metrics that may not transfer to operational networks. There is thus growing concern that much current work is “overfitting” to simulation sandboxes, optimizing for a non-existent ground truth rather than solving the real channel prediction problem. We chart a constructive three-pronged path forward, with concrete guidelines for benchmark design, dataset standards, and evaluation protocols.
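Issue (1), training–test leakage through a shared simulator, can be made concrete with a minimal sketch. The toy below (not from the paper; the sum-of-sinusoids fading model, the Doppler values, and the linear predictor are all illustrative assumptions) trains a one-step channel predictor on traces from one synthetic simulator and evaluates it on held-out traces from the same simulator versus traces from a differently parameterized one:

```python
# Illustrative sketch of training-test leakage in channel prediction.
# All model choices here (Jakes-style sum-of-sinusoids fading, AR(4)
# linear predictor, Doppler values) are hypothetical assumptions.
import numpy as np

rng = np.random.default_rng(0)

def jakes_like_trace(n, doppler, n_paths=8):
    """Crude sum-of-sinusoids fading trace (Jakes-style approximation)."""
    t = np.arange(n)
    phases = rng.uniform(0, 2 * np.pi, n_paths)
    aoa = rng.uniform(0, 2 * np.pi, n_paths)  # angles of arrival
    return np.sum(
        np.cos(2 * np.pi * doppler * np.cos(aoa)[:, None] * t + phases[:, None]),
        axis=0,
    ) / np.sqrt(n_paths)

def make_dataset(n_traces, n, doppler, order=4):
    """Sliding windows of `order` past samples -> next sample."""
    X, y = [], []
    for _ in range(n_traces):
        h = jakes_like_trace(n, doppler)
        for i in range(order, n):
            X.append(h[i - order:i])
            y.append(h[i])
    return np.array(X), np.array(y)

order = 4
X_tr, y_tr = make_dataset(50, 200, doppler=0.01, order=order)
w, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)  # fit linear one-step predictor

def nmse(X, y):
    """Normalized mean-squared prediction error."""
    e = y - X @ w
    return float(np.mean(e**2) / np.mean(y**2))

# Held-out data from the SAME simulator vs. a shifted-Doppler simulator.
X_same, y_same = make_dataset(20, 200, doppler=0.01, order=order)
X_shift, y_shift = make_dataset(20, 200, doppler=0.05, order=order)

print(f"NMSE, same simulator:    {nmse(X_same, y_same):.2e}")
print(f"NMSE, shifted simulator: {nmse(X_shift, y_shift):.2e}")
```

The held-out error under the same simulator looks excellent even for this trivial predictor, while the error under a shifted channel statistic degrades markedly, which is the inflation the abstract warns about: "held-out" data from the training simulator is not evidence of generalization to operational channels.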