Workshop: Synthetic Data for Empowering ML Research

HandsOff: Labeled Dataset Generation with No Additional Human Annotations

Austin Xu · Mariya Vasileva · Arjun Seshadri


Because of their success in producing realistic images, generative adversarial networks (GANs) have recently been leveraged to generate labeled synthetic datasets. However, existing dataset generation methods do not make sufficient use of existing images with high-quality labels, which often limits either the practicality of the system or the complexity of the generated labels. We propose the HandsOff framework, which is capable of producing an unlimited number of synthetic images and corresponding labels after being trained on a small number of pre-existing labeled images. Our framework avoids the practical drawbacks of similar frameworks while retaining the ability to generate rich, pixel-wise labels, such as segmentation masks. This capability is achieved by unifying the field of GAN inversion with synthetic dataset generation, providing a new application for GAN inversion techniques. We demonstrate the efficacy of our framework on semantic segmentation tasks by generating labeled image datasets, training downstream models on them, and evaluating the resulting downstream performance. We also qualitatively assess the performance of the GAN inversion techniques used in our framework. Finally, we explore directions for making the framework more lightweight from a computational resource perspective.
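The pipeline described in the abstract (invert existing labeled images into a generator's latent space, learn a label predictor on generator features, then sample new image and label pairs) can be illustrated with a toy sketch. This is not the authors' implementation: the linear "generator", the optimization-based inversion, the least-squares label head, and all names and sizes below are assumptions chosen only so the example runs in plain NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)
P, C, D = 64, 8, 4  # pixels, per-pixel feature channels, latent size (toy values)

# Hypothetical stand-in for a pretrained GAN generator:
# latent z -> per-pixel features (P, C) -> image (P,).
A = rng.normal(size=(P, C, D))
w_img = rng.normal(size=C)
M = np.einsum("pcd,c->pd", A, w_img)  # for this linear toy, image = M @ z

def features(z):
    return A @ z  # (P, C, D) @ (D,) -> (P, C)

def generate(z):
    return features(z) @ w_img  # (P,)

def invert(image, steps=300):
    """Optimization-based inversion: gradient descent on 0.5 * ||G(z) - image||^2."""
    z = np.zeros(D)
    lr = 1.0 / np.linalg.norm(M, 2) ** 2  # 1 / largest eigenvalue of M^T M
    for _ in range(steps):
        z -= lr * M.T @ (generate(z) - image)
    return z

# Simulate a small set of existing labeled images. The rule v_true only
# plays the role of the human annotator; the pipeline never reads it.
v_true = rng.normal(size=C)
z_real = rng.normal(size=(16, D))
images = np.stack([generate(z) for z in z_real])
labels = np.stack([features(z) @ v_true > 0 for z in z_real])  # (16, P) masks

# Step 1: invert each labeled image to a latent code.
z_inv = np.stack([invert(x) for x in images])

# Step 2: fit a per-pixel label head on generator features (least squares on +/-1 targets).
F = np.concatenate([features(z) for z in z_inv])  # (16 * P, C)
y = np.where(labels.reshape(-1), 1.0, -1.0)
u, *_ = np.linalg.lstsq(F, y, rcond=None)

# Step 3: sample new latents and emit image and mask pairs with no further annotation.
def sample_labeled(n):
    zs = rng.normal(size=(n, D))
    new_images = np.stack([generate(z) for z in zs])
    new_masks = np.stack([features(z) @ u > 0 for z in zs])
    return new_images, new_masks, zs

new_images, new_masks, new_z = sample_labeled(100)
```

In a realistic setting the generator would be a pretrained GAN, inversion would typically rely on an encoder or a perceptual loss, and the label head would be a learned network, so the sketch only conveys the data flow between the three steps.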
