Real-Time Anomaly Detection System for Data Quality Monitoring in the CMS ECAL using Autoencoders
Abstract
Large-scale experiments such as CMS, a general-purpose detector that records high-energy collisions at the CERN LHC, involve complex detector components and operate under changing conditions, making continuous monitoring essential to ensure physics-quality data. In the CMS electromagnetic calorimeter (ECAL), traditional cut-based data quality monitoring (DQM) addresses known issues but has limited ability to detect rare or unforeseen anomalies. We present a semi-supervised anomaly detection framework based on autoencoders, developed for and deployed in the CMS online DQM workflow during LHC Run 3 (2022-2026). The method improves anomaly detection performance by incorporating both the temporal evolution of anomalies and spatial variations of the detector response. Deployment results from Run 3 are presented, showing that the system detected issues missed by the existing monitoring and provided early indications of degrading components. The framework enables real-time anomaly detection with a very low false-alarm rate, representing one of the first operational uses of deep learning for DQM in high-energy physics and offering a generalizable approach for other scientific experiments.
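To make the idea of combining spatial and temporal information concrete, the following is a minimal sketch of how per-channel reconstruction errors from an autoencoder might be post-processed. It is an illustration under stated assumptions, not the procedure used in this work: the function names, the per-cell baseline normalization, the requirement that an excess persist over consecutive time steps, and all numerical values are hypothetical. The autoencoder itself is not shown; its output is represented by a fixed `reconstructed` map.

```python
from typing import List, Tuple

Grid = List[List[float]]  # a 2D map of detector cells (hypothetical layout)


def squared_error(observed: Grid, reconstructed: Grid) -> Grid:
    """Per-cell squared difference between input and autoencoder output."""
    return [[(o - r) ** 2 for o, r in zip(o_row, r_row)]
            for o_row, r_row in zip(observed, reconstructed)]


def spatially_normalize(loss: Grid, baseline: Grid, eps: float = 1e-9) -> Grid:
    """Divide each cell's loss by its typical loss on good data (assumed
    given), so regions with naturally larger errors are not over-flagged."""
    return [[l / (b + eps) for l, b in zip(l_row, b_row)]
            for l_row, b_row in zip(loss, baseline)]


def persistent_anomalies(loss_maps: List[Grid],
                         threshold: float) -> List[Tuple[int, int]]:
    """Flag cells whose normalized loss exceeds the threshold in every map
    of the time window, suppressing one-off fluctuations."""
    rows, cols = len(loss_maps[0]), len(loss_maps[0][0])
    return [(i, j) for i in range(rows) for j in range(cols)
            if all(m[i][j] > threshold for m in loss_maps)]


# Toy data (hypothetical): a 2x2 map observed over three consecutive time steps.
reconstructed = [[10.0, 10.0], [10.0, 10.0]]
baseline = [[1.0, 4.0], [1.0, 1.0]]  # typical per-cell loss on good data
observations = [
    [[10.0, 12.0], [13.0, 10.0]],
    [[10.0, 10.0], [14.0, 13.0]],
    [[10.0, 12.0], [13.0, 10.0]],
]

normalized = [spatially_normalize(squared_error(obs, reconstructed), baseline)
              for obs in observations]
flagged = persistent_anomalies(normalized, threshold=5.0)
print(flagged)  # [(1, 0)]
```

In this toy example, cell (1, 0) is flagged because its normalized loss stays above the threshold in all three time steps, whereas cell (1, 1) exceeds it only once and is ignored. Requiring persistence in this way is one simple route to a low false-alarm rate, at the cost of a short detection delay.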