NeurIPS Poster UltraEdit: Instruction-based Fine-Grained Image Editing at Scale

Haozhe Zhao · Xiaojian (Shawn) Ma · Liang Chen · Shuzheng Si · Rujie Wu · Kaikai An · Peiyu Yu · Minjia Zhang · Qing Li · Baobao Chang

West Ballroom A-D #5107

Wed 11 Dec, 11 a.m. to 2 p.m. PST

Abstract:

This paper presents UltraEdit, a large-scale (~4M editing samples), automatically generated dataset for instruction-based image editing. Our key idea is to address the drawbacks of existing image editing datasets such as InstructPix2Pix and MagicBrush, and to provide a systematic approach to producing massive, high-quality image editing samples: 1) UltraEdit includes more diverse editing instructions by combining LLM creativity with in-context editing examples from human raters; 2) UltraEdit is anchored on real images (photographs or artworks), which offer more diversity and fewer biases than images purely synthesized by text-to-image models; 3) UltraEdit supports region-based editing with high-quality, automatically produced region annotations. Our experiments show that canonical diffusion-based editing baselines trained on UltraEdit set new records on the challenging MagicBrush and Emu-Edit benchmarks. Our analysis further confirms the crucial role of real image anchors and region-based editing data. The dataset, code, and models will be made public.
