In principle, applying variational autoencoders (VAEs) to sequential data offers a method for controlled sequence generation, manipulation, and structured representation learning. However, training sequence VAEs is challenging: autoregressive decoders can often explain the data without utilizing the latent space, a failure mode known as posterior collapse. To mitigate this, state-of-the-art models 'weaken' the 'powerful decoder' by applying uniformly random dropout to the decoder input. We show theoretically that this removes pointwise mutual information provided by the decoder input, which is instead compensated for by utilizing the latent space. We then propose an adversarial training strategy to achieve information-based stochastic dropout. Compared to uniform dropout on standard text benchmark datasets, our targeted approach increases both sequence modeling performance and the information captured in the latent space.
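The uniform-dropout baseline described above is straightforward to sketch. Below is a minimal PyTorch illustration of word dropout on the decoder input of a sequence VAE; the function name `word_dropout`, the argument `unk_id`, and the rate `p` are illustrative assumptions, not the paper's released code.

```python
import torch

def word_dropout(decoder_input_ids: torch.Tensor, unk_id: int, p: float = 0.3) -> torch.Tensor:
    """Uniformly replace decoder input tokens with <unk> with probability p.

    By randomly hiding ground-truth tokens, the autoregressive decoder can no
    longer explain the sequence on its own, so reconstruction must draw on the
    latent code z. (Illustrative sketch of the uniform-dropout baseline.)
    """
    # Bernoulli mask: True where a token is dropped.
    drop = torch.rand(decoder_input_ids.shape, device=decoder_input_ids.device) < p
    return torch.where(drop, torch.full_like(decoder_input_ids, unk_id), decoder_input_ids)

# Hypothetical usage during training:
# noisy_inputs = word_dropout(target_ids[:, :-1], unk_id=3, p=0.3)
# logits = decoder(noisy_inputs, z)  # z sampled from the approximate posterior
```

The paper's contribution replaces this uniform corruption with an adversarially learned, information-based dropout that targets the tokens carrying the most pointwise mutual information.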
Author Information
Djordje Miladinovic (ETH Zurich)
Kumar Shridhar (Swiss Federal Institute of Technology)
Kushal Jain (University of California, San Diego)
Max Paulus (ETH Zurich)
Joachim M Buhmann (ETH Zurich)
Carl Allen (ETH Zurich)
- Postdoc in the AI Centre, ETH Zurich.
- PhD from University of Edinburgh, 2021 (Profs Tim Hospedales, Iain Murray).
- PhD focus: a theoretical understanding of how words and relations can be represented.
- Broad interest: mathematical understanding of machine learning methods.
- Current focus: representation and latent variable models, e.g. VAEs, contrastive learning.
More from the Same Authors
- 2022 Poster: Learning Long-Term Crop Management Strategies with CyclesGym » Matteo Turchetta · Luca Corinzia · Scott Sussex · Amanda Burton · Juan Herrera · Ioannis Athanasiadis · Joachim M Buhmann · Andreas Krause
- 2020 Poster: Gradient Estimation with Stochastic Softmax Tricks » Max Paulus · Dami Choi · Danny Tarlow · Andreas Krause · Chris Maddison
- 2020 Oral: Gradient Estimation with Stochastic Softmax Tricks » Max Paulus · Dami Choi · Danny Tarlow · Andreas Krause · Chris Maddison
- 2019 Workshop: Disentanglement Challenge - Disentanglement and Results of the Challenge Stages 1 & 2 » Djordje Miladinovic · Stefan Bauer · Daniel Keysers
- 2019 Poster: What the Vec? Towards Probabilistically Grounded Embeddings » Carl Allen · Ivana Balazevic · Timothy Hospedales
- 2019 Poster: Multi-relational Poincaré Graph Embeddings » Ivana Balazevic · Carl Allen · Timothy Hospedales
- 2017 Poster: Efficient and Flexible Inference for Stochastic Systems » Stefan Bauer · Nico S Gorbach · Djordje Miladinovic · Joachim M Buhmann
- 2017 Poster: Non-monotone Continuous DR-submodular Maximization: Structure and Algorithms » Yatao Bian · Kfir Levy · Andreas Krause · Joachim M Buhmann
- 2017 Poster: Scalable Variational Inference for Dynamical Systems » Nico S Gorbach · Stefan Bauer · Joachim M Buhmann
- 2016 Poster: Scalable Adaptive Stochastic Optimization Using Random Projections » Gabriel Krummenacher · Brian McWilliams · Yannic Kilcher · Joachim M Buhmann · Nicolai Meinshausen
- 2014 Poster: Fast and Robust Least Squares Estimation in Corrupted Linear Models » Brian McWilliams · Gabriel Krummenacher · Mario Lucic · Joachim M Buhmann
- 2014 Spotlight: Fast and Robust Least Squares Estimation in Corrupted Linear Models » Brian McWilliams · Gabriel Krummenacher · Mario Lucic · Joachim M Buhmann
- 2013 Poster: Correlated random features for fast semi-supervised learning » Brian McWilliams · David Balduzzi · Joachim M Buhmann
- 2011 Workshop: Philosophy and Machine Learning » Marcello Pelillo · Joachim M Buhmann · Tiberio Caetano · Bernhard Schölkopf · Larry Wasserman
- 2006 Poster: Denoising and Dimension Reduction in Feature Space » Mikio L Braun · Joachim M Buhmann · Klaus-Robert Müller