

Oral in Workshop: Deep Generative Models and Downstream Applications

Accurate Imputation and Efficient Data Acquisition with Transformer-based VAEs

Sarah Lewis · Tatiana Matejovicova · Yingzhen Li · Angus Lamb · Yordan Zaykov · Miltiadis Allamanis · Cheng Zhang


Abstract:

Predicting missing values in tabular data, with uncertainty, is an essential task in its own right as well as for downstream tasks such as personalized data acquisition. It is not clear whether state-of-the-art deep generative models for these tasks are well equipped to model the complex relationships that may exist between different features, especially when the subset of observed data is treated as a set. In this work, we propose new attention-based models for estimating the joint conditional distribution of randomly missing values in mixed-type tabular data. The models improve on the state-of-the-art Partial Variational Autoencoder (Ma et al., 2019) on a range of imputation and information acquisition tasks.
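
To make the idea of treating the observed features as a set concrete, the following is a minimal sketch of self-attention over only the observed entries of one row, followed by mean pooling. It is an illustration under stated assumptions, not the architecture from the paper: the function name `masked_attention_pool`, the fixed per-feature embeddings in `embed`, and the use of identity query/key/value projections (no learned weights) are all hypothetical choices made for brevity.

```python
import math

def masked_attention_pool(values, observed, embed):
    """Pool the observed features of one row with single-head self-attention.

    values:   list of floats, one per feature (unobserved entries are ignored)
    observed: list of bools, True where the feature is observed
    embed:    list of per-feature embedding vectors (hypothetical, fixed here)
    """
    dim = 2 * len(embed[0])
    # One token per observed feature: [embedding * value, embedding].
    tokens = [[e * v for e in embed[i]] + list(embed[i])
              for i, (v, m) in enumerate(zip(values, observed)) if m]
    if not tokens:
        return [0.0] * dim  # nothing observed: return a zero summary

    outputs = []
    for q in tokens:
        # Scaled dot-product scores against every observed token.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(weights)
        weights = [w / total for w in weights]
        # Attention output: weighted average of the tokens (identity value projection).
        outputs.append([sum(w * t[j] for w, t in zip(weights, tokens)) for j in range(dim)])

    # Mean pooling makes the summary independent of feature order.
    return [sum(o[j] for o in outputs) / len(outputs) for j in range(dim)]


embed = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
summary = masked_attention_pool([1.0, 99.0, 2.0], [True, False, True], embed)
```

Because unobserved features never become tokens, whatever placeholder sits at a missing position has no effect on `summary`, and reordering the features (together with their embeddings) leaves the pooled result unchanged up to floating-point rounding. In a full model, a summary like this would typically feed an encoder that outputs the parameters of a latent distribution; that part is omitted here.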
