Towards Application Aligned Synthetic Surgical Image Synthesis
Danush Kumar Venkatesh · Stefanie Speidel
Abstract
The scarcity of annotated surgical data poses a significant challenge for developing deep learning systems in computer-assisted interventions. While diffusion models can synthesize realistic images, they often suffer from data memorization, resulting in inconsistent or non-diverse samples that may fail to improve, or even harm, downstream performance. We introduce \emph{Surgical Application-Aligned Diffusion} (SAADi), a new framework that aligns diffusion models with samples preferred by downstream models. Our method constructs pairs of \emph{preferred} and \emph{non-preferred} synthetic images and employs lightweight fine-tuning of diffusion models to align the generation process with downstream objectives explicitly. Experiments on three surgical datasets demonstrate consistent gains of $7$-$9$% in classification and $2$-$10$% in segmentation tasks, with the considerable improvements observed for underrepresented classes. Iterative refinement of synthetic samples further boosts performance by $4$-$10$%. Unlike baseline approaches, our method overcomes sample degradation and establishes task-aware alignment as a key principle for mitigating data scarcity and advancing surgical vision applications.
Chat is not available.
Successful Page Load