Poster
Exploring DCN-like architecture for fast image generation with arbitrary resolution
Shuai Wang · Zexian Li · Tianhui Song · Xubin Li · Tiezheng Ge · Bo Zheng · Limin Wang
East Exhibit Hall A-C #2401
Abstract:
Arbitrary-resolution image generation remains a challenging task in AIGC, as it requires handling varying resolutions and aspect ratios while maintaining high visual quality. Existing transformer-based diffusion methods suffer from quadratic computation cost and limited resolution extrapolation capabilities, making them less effective for this task. In this paper, we propose FlowDCN, a purely convolution-based generative model with linear time and memory complexity that can efficiently generate high-quality images at arbitrary resolutions. Equipped with a newly designed learnable group-wise deformable convolution block, FlowDCN offers greater flexibility and capacity to handle different resolutions with a single model. FlowDCN achieves the state-of-the-art 4.30 sFID on ImageNet Benchmark and comparable resolution extrapolation results, surpassing transformer-based counterparts in terms of convergence speed (only images), visual quality, parameters ( reduction) and FLOPs ( reduction). We believe FlowDCN offers a promising solution to scalable and flexible image synthesis.
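To make the quadratic-versus-linear contrast in the abstract concrete, the following is a minimal counting sketch, not FlowDCN's implementation. The patch size of 16 and the 9 sampling points per position are illustrative assumptions, and the helper names are hypothetical. The counts ignore channel width, heads, and depth; they only show how the number of pairwise interactions grows with resolution.

```python
def num_tokens(height, width, patch=16):
    """Spatial positions after patchifying (patch=16 is an assumed value)."""
    return (height // patch) * (width // patch)


def attention_pair_count(height, width, patch=16):
    """Global self-attention compares every token with every other: N * N."""
    n = num_tokens(height, width, patch)
    return n * n


def local_sampling_count(height, width, patch=16, kernel_points=9):
    """A convolution-style operator reads a fixed number of points
    per position (kernel_points=9 is an assumed value): N * K."""
    n = num_tokens(height, width, patch)
    return n * kernel_points


if __name__ == "__main__":
    for side in (256, 512, 1024):
        print(side,
              attention_pair_count(side, side),
              local_sampling_count(side, side))
```

Under these assumptions, doubling the image side multiplies the attention pair count by 16 but the fixed-point sampling count by only 4, which is the scaling gap the abstract refers to.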