Diffusion Autoencoders with Perceivers for Long, Irregular and Multimodal Astronomical Sequences
Yunyi Shen · Alex Gagliano
Abstract
We introduce a perceiver-based diffusion autoencoder (daep) for learning bottleneck representations from long, irregularly sampled sequences across multiple modalities. Our approach encodes each modality with a perceiver encoder, optionally applies late fusion for multimodal data, and decodes with modality-specific diffusion decoders. Unlike standard $\beta$-VAEs, daep produces more regular latent spaces and captures finer detail in reconstructions, as demonstrated on high-resolution variable star spectra, ZTF supernova light curves and spectra, and Galaxy10 images. We further show that the model supports cross-modality generation and handles missing modalities through modality dropping. These results highlight daep’s potential for downstream tasks such as classification, clustering, and generative modeling of complex scientific data.
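The encoder design described above (per-modality perceiver encoders whose outputs are late-fused into a shared bottleneck) can be sketched in PyTorch. This is a minimal illustration under our own assumptions, not the paper's implementation: the dimensions, the number of latent queries, and concatenation as the late-fusion operator are all hypothetical choices, and the diffusion decoders are omitted.

```python
import torch
import torch.nn as nn

class PerceiverEncoder(nn.Module):
    """Hypothetical sketch of a perceiver-style encoder: a small set of
    learned latent queries cross-attends to an input sequence, so the
    bottleneck size is fixed regardless of sequence length or sampling."""
    def __init__(self, dim=64, n_latents=8, n_heads=4):
        super().__init__()
        self.latents = nn.Parameter(torch.randn(n_latents, dim))
        self.cross_attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)

    def forward(self, x, pad_mask=None):
        # x: (batch, seq_len, dim); pad_mask: (batch, seq_len), True where padded,
        # which lets the encoder ignore missing/irregular time steps.
        q = self.latents.unsqueeze(0).expand(x.size(0), -1, -1)
        z, _ = self.cross_attn(q, x, x, key_padding_mask=pad_mask)
        return z  # (batch, n_latents, dim) bottleneck

# Two modalities (e.g. light curves and spectra) with different lengths,
# encoded separately and late-fused by concatenating their latents.
enc_a, enc_b = PerceiverEncoder(), PerceiverEncoder()
xa = torch.randn(2, 100, 64)  # modality A: 100 observations
xb = torch.randn(2, 37, 64)   # modality B: 37 observations
z = torch.cat([enc_a(xa), enc_b(xb)], dim=1)
print(z.shape)  # torch.Size([2, 16, 64])
```

Because the latent queries, not the inputs, set the output size, both modalities map to fixed-width codes that fuse cleanly, and dropping a modality simply means omitting its slice of the concatenation.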