Although autoregressive models have achieved promising results on image generation, their unidirectional generation process prevents the resultant images from fully reflecting global contexts. To address this issue, we propose an effective image generation framework, \emph{Draft-and-Revise} with a \emph{Contextual RQ-Transformer}, that considers global contexts during the generation process. As a generalized VQ-VAE, RQ-VAE first represents a high-resolution image as a sequence of discrete code stacks. After code stacks in the sequence are randomly masked, the Contextual RQ-Transformer is trained to infill the masked code stacks based on the unmasked contexts of the image. We then propose a two-phase decoding scheme, Draft-and-Revise, which allows the Contextual RQ-Transformer to generate an image while fully exploiting its global contexts during the generation process. Specifically, in the \emph{draft} phase, our model first focuses on generating diverse images, albeit of rather low quality. Then, in the \emph{revise} phase, the model iteratively improves the quality of the images while preserving their global contexts. In experiments, our method achieves state-of-the-art results on conditional image generation. We also validate that Draft-and-Revise decoding achieves high performance by effectively controlling the quality-diversity trade-off in image generation.
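To make the two-phase decoding concrete, the following is a minimal sketch of a Draft-and-Revise loop written against a hypothetical Contextual RQ-Transformer interface. The names model, MASK_ID, sample_masked, and the step/temperature schedule are illustrative assumptions for exposition, not the authors' actual implementation; the generated code stacks would finally be decoded into an image by the RQ-VAE decoder.

    # Minimal sketch of Draft-and-Revise decoding (assumed interface, not the authors' code).
    import torch

    MASK_ID = -1  # placeholder id marking a masked code stack

    def sample_masked(model, codes, positions, temperature=1.0):
        """Resample the code stacks at `positions`, conditioned on all other positions."""
        masked = codes.clone()
        masked[:, positions] = MASK_ID
        logits = model(masked)                       # assumed shape: (B, N, D, vocab_size)
        probs = torch.softmax(logits[:, positions] / temperature, dim=-1)
        B, P, D, V = probs.shape
        sampled = torch.multinomial(probs.reshape(-1, V), 1).reshape(B, P, D)
        out = codes.clone()
        out[:, positions] = sampled
        return out

    def draft_and_revise(model, num_positions, depth, draft_steps=8, revise_epochs=2):
        # Draft phase: start fully masked and infill random partitions of positions
        # over several steps, producing a diverse but possibly low-quality draft.
        codes = torch.full((1, num_positions, depth), MASK_ID, dtype=torch.long)
        order = torch.randperm(num_positions)
        for chunk in order.chunk(draft_steps):
            codes = sample_masked(model, codes, chunk, temperature=1.0)

        # Revise phase: repeatedly resample subsets conditioned on the rest,
        # improving quality while preserving the drafted global context.
        for _ in range(revise_epochs):
            order = torch.randperm(num_positions)
            for chunk in order.chunk(draft_steps):
                codes = sample_masked(model, codes, chunk, temperature=0.8)
        return codes  # pass to the RQ-VAE decoder to obtain the image

In this sketch, the number of draft steps, the number of revise epochs, and the sampling temperatures are the knobs that would control the quality-diversity trade-off mentioned above.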
Author Information
Doyup Lee (Kakao Brain)
Chiheon Kim (Kakao Brain)
Saehoon Kim (Kakao Brain)
Minsu Cho (POSTECH)
Wook-Shin Han (POSTECH)
More from the Same Authors
- 2020 : Combinatorial 3D Shape Generation via Sequential Assembly »
  Jungtaek Kim · Hyunsoo Chung · Jinhwi Lee · Minsu Cho · Jaesik Park
- 2022 : SeLCA: Self-Supervised Learning of Canonical Axis »
  Seungwook Kim · Yoonwoo Jeong · Chunghyun Park · Jaesik Park · Minsu Cho
- 2022 Poster: Locally Hierarchical Auto-Regressive Modeling for Image Generation »
  Tackgeun You · Saehoon Kim · Chiheon Kim · Doyup Lee · Bohyung Han
- 2022 Poster: PeRFception: Perception using Radiance Fields »
  Yoonwoo Jeong · Seungjoo Shin · Junha Lee · Chris Choy · Anima Anandkumar · Minsu Cho · Jaesik Park
- 2022 Poster: Peripheral Vision Transformer »
  Juhong Min · Yucheng Zhao · Chong Luo · Minsu Cho
- 2021 Poster: Brick-by-Brick: Combinatorial Construction with Deep Reinforcement Learning »
  Hyunsoo Chung · Jungtaek Kim · Boris Knyazev · Jinhwi Lee · Graham Taylor · Jaesik Park · Minsu Cho
- 2021 Poster: Rebooting ACGAN: Auxiliary Classifier GANs with Stable Training »
  Minguk Kang · Woohyeon Shim · Minsu Cho · Jaesik Park
- 2021 Poster: Relational Self-Attention: What's Missing in Attention for Video Understanding »
  Manjin Kim · Heeseung Kwon · Chunyu Wang · Suha Kwak · Minsu Cho
- 2020 Poster: CircleGAN: Generative Adversarial Learning across Spherical Circles »
  Woohyeon Shim · Minsu Cho
- 2019 Poster: Mining GOLD Samples for Conditional GANs »
  Sangwoo Mo · Chiheon Kim · Sungwoong Kim · Minsu Cho · Jinwoo Shin
- 2019 Poster: Fast AutoAugment »
  Sungbin Lim · Ildoo Kim · Taesup Kim · Chiheon Kim · Sungwoong Kim
- 2018 Poster: Uncertainty-Aware Attention for Reliable Interpretation and Prediction »
  Jay Heo · Hae Beom Lee · Saehoon Kim · Juho Lee · Kwang Joon Kim · Eunho Yang · Sung Ju Hwang
- 2018 Poster: DropMax: Adaptive Variational Softmax »
  Hae Beom Lee · Juho Lee · Saehoon Kim · Eunho Yang · Sung Ju Hwang
- 2017 : Learning to Transfer Initializations for Bayesian Hyperparameter Optimization »
  Saehoon Kim