

Poster

I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing

Yiwei Ma · Jiayi Ji · Ke Ye · Weihuang Lin · Zhibin Wang · Yonghan Zheng · Qiang Zhou · Xiaoshuai Sun · Rongrong Ji

Fri 13 Dec 11 a.m. PST — 2 p.m. PST

Abstract:

Instruction-based Image Editing (IIE) has made significant progress, but evaluating IIE models remains a major challenge. The field needs a comprehensive evaluation benchmark that accurately assesses editing results and provides insights for further development. In response to this need, we propose I2EBench, a comprehensive benchmark designed to automatically evaluate the quality of edited images produced by IIE models along multiple dimensions. I2EBench consists of 2000+ images for editing, along with corresponding original and diverse instructions. It offers three distinctive characteristics: 1) Comprehensive Evaluation Dimensions: I2EBench comprises 16 evaluation dimensions that cover both high-level and low-level aspects, providing a comprehensive assessment of each IIE model. 2) Human Perception Alignment: To ensure that our benchmark aligns with human perception, we conducted an extensive user study for each evaluation dimension. 3) Valuable Research Insights: By analyzing the strengths and weaknesses of existing IIE models across the 16 dimensions, we offer research insights to guide future development in the field. We will open-source I2EBench, including all instructions, input images, human annotations, edited images from all evaluated methods, and a simple script for evaluating the results of new IIE models.
