Workshop: Machine Learning and the Physical Sciences

Interpretable Encoding of Galaxy Spectra

Yan Liang · Peter Melchior · Sicong Lu


We present a novel loss function for training autoencoder models of galaxy spectra. Our architecture reliably captures intrinsic spectral features regardless of redshift, providing highly realistic reconstructions of SDSS galaxy spectra using as few as two latent parameters. However, interpreting the encoded parameters remains difficult because the decoding process is non-linear and the latent space can be highly degenerate: different latent positions can map to virtually indistinguishable spectra. To resolve this encoding ambiguity, we introduce a new similarity loss, which explicitly links latent-space distances to data-space distances. Minimizing the similarity loss together with the common fidelity loss leads to non-degenerate, highly accurate spectrum models that generalize over variations in noise, masking, and redshift, while providing a latent-space distribution with clear separations between common and anomalous data.
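
The idea of tying latent-space distances to data-space distances can be illustrated with a minimal NumPy sketch. This is not the authors' formulation: the function names, the squared-difference penalty, the weighting factor `alpha`, and the assumption that all spectra in a batch share a common wavelength grid are hypothetical choices made only for illustration.

```python
import numpy as np

def pairwise_sq_dist(a):
    """Mean squared difference between every pair of rows in `a`."""
    diff = a[:, None, :] - a[None, :, :]
    return np.mean(diff ** 2, axis=-1)

def fidelity_loss(spectra, recon, weights):
    """Weighted mean squared reconstruction error.

    `weights` would typically be inverse variances, with zeros for
    masked pixels (an assumption of this sketch).
    """
    return np.sum(weights * (spectra - recon) ** 2) / np.sum(weights)

def similarity_loss(spectra, latents):
    """Penalize mismatch between latent-space and data-space distances.

    Hypothetical form: mean squared difference of the two pairwise
    distance matrices, taken over distinct pairs only.
    """
    d_data = pairwise_sq_dist(spectra)
    d_latent = pairwise_sq_dist(latents)
    upper = np.triu_indices(len(spectra), k=1)  # each pair counted once
    return np.mean((d_latent[upper] - d_data[upper]) ** 2)

def total_loss(spectra, recon, weights, latents, alpha=1.0):
    """Fidelity plus `alpha` times similarity (alpha is an assumed knob)."""
    return (fidelity_loss(spectra, recon, weights)
            + alpha * similarity_loss(spectra, latents))
```

In this sketch, two identical spectra encoded at distant latent positions incur a positive similarity penalty, which is the degenerate case described in the abstract; a fidelity term alone would not penalize it.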
