Selfish Evolution: Making Discoveries in Extreme Label Noise with the Help of Overfitting Dynamics
Abstract
Motivated by the scarcity of reliable labels in astrophysical surveys, we introduce Selfish Evolution, a weakly supervised technique that detects and corrects corrupted labels in situ. Rather than relying on early stopping, we first train on the noisy dataset and then deliberately overfit to individual samples; the resulting overfitting dynamics form spatiotemporal “evolution cubes” that are predictive of both whether a label is noisy and what its corrected value should be. A secondary network learns this mapping to produce pixel-level label repairs. The procedure runs in a closed loop, in which cleaned labels improve the detector and the improved detector in turn reveals fainter events, and it requires no assumptions about the model state at the time of intervention. Focusing on supernova detection, we demonstrate convergence toward a largely clean training set and recovery of missed astrophysical objects, advancing discovery under extreme label noise.
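To make the notion of an evolution cube concrete, the following is a minimal sketch under strong simplifying assumptions and is not the implementation used in this work. It assumes a toy per-pixel logistic model with two scalar parameters (`w`, `b`) standing in for the pretrained detector, plain gradient descent on a binary cross-entropy loss, and a hypothetical helper name `evolution_cube`. The model is overfit to a single image and its noisy mask, and the prediction map after each step is stacked along a time axis.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def evolution_cube(image, noisy_mask, w, b, steps=20, lr=0.5):
    """Overfit a toy per-pixel logistic model to ONE sample.

    image, noisy_mask: arrays of shape (H, W); the mask holds values in {0, 1}.
    w, b: scalar parameters of the (assumed) already-trained toy model.
    Returns an array of shape (steps, H, W): one prediction map per step.
    """
    frames = []
    for _ in range(steps):
        pred = sigmoid(w * image + b)
        # Gradient of the mean binary cross-entropy w.r.t. the logits.
        err = pred - noisy_mask
        w = w - lr * np.mean(err * image)
        b = b - lr * np.mean(err)
        # Record the prediction map after this update.
        frames.append(sigmoid(w * image + b))
    return np.stack(frames)


# Illustrative data only: a random image and a mask with one flipped pixel.
rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8))
noisy_mask = (image > 0.5).astype(float)
noisy_mask[0, 0] = 1.0 - noisy_mask[0, 0]

cube = evolution_cube(image, noisy_mask, w=1.0, b=0.0)
print(cube.shape)
```

In this sketch the first axis of `cube` indexes overfitting steps and the remaining two index pixels, so each pixel carries a time series of predictions. A secondary model of the kind described above would take such a cube as input; its architecture and training are not specified here and are left out of the sketch.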