
Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs
Djordje Miladinovic · Kumar Shridhar · Kushal Jain · Max Paulus · Joachim M Buhmann · Carl Allen

Wed Nov 30 09:00 AM -- 11:00 AM (PST) @ Hall J #439

In principle, applying variational autoencoders (VAEs) to sequential data offers a method for controlled sequence generation, manipulation, and structured representation learning. However, training sequence VAEs is challenging: autoregressive decoders can often explain the data without utilizing the latent space, a phenomenon known as posterior collapse. To mitigate this, state-of-the-art models weaken the 'powerful decoder' by applying uniformly random dropout to the decoder input. We show theoretically that this removes pointwise mutual information provided by the decoder input, which is instead compensated for by utilizing the latent space. We then propose an adversarial training strategy to achieve information-based stochastic dropout. Compared to uniform dropout on standard text benchmark datasets, our targeted approach increases both sequence modeling performance and the information captured in the latent space.
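The baseline the abstract refers to, uniformly random dropout on the decoder input, is commonly implemented as word dropout: each ground-truth token fed to the autoregressive decoder is independently replaced by an UNK token with some fixed probability, forcing the decoder to lean on the latent code. A minimal sketch of this baseline (the function name, `unk_id` convention, and probability value are illustrative assumptions, not the paper's implementation):

```python
import random

def word_dropout(tokens, p, unk_id=0, rng=None):
    """Uniform word dropout on decoder inputs: replace each token
    with UNK independently with probability p."""
    rng = rng or random.Random()
    return [unk_id if rng.random() < p else t for t in tokens]

# Example: corrupt roughly 30% of the decoder's input tokens,
# so it cannot fully explain the sequence without the latent code.
rng = random.Random(42)
sequence = [5, 17, 23, 8, 42, 3]       # token ids of a training sequence
decoder_input = word_dropout(sequence, p=0.3, rng=rng)
```

The paper's contribution replaces this uniform corruption with an adversarially learned, information-based dropout distribution, so that the tokens removed are those carrying the most pointwise mutual information rather than a random subset.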

Author Information

Djordje Miladinovic (ETH Zurich)
Kumar Shridhar (Swiss Federal Institute of Technology)
Kushal Jain (University of California, San Diego)
Max Paulus (ETH Zurich)
Joachim M Buhmann (ETH Zurich)
Carl Allen (ETH Zurich)

- Postdoc in the AI Centre, ETH Zurich.
- PhD from the University of Edinburgh, 2021 (supervised by Profs Tim Hospedales and Iain Murray), focusing on a theoretical understanding of how words and relations can be represented.
- Broad interest: mathematical understanding of machine learning methods.
- Current focus: representation and latent variable models, e.g. VAEs and contrastive learning.
