
Workshop: Synthetic Data Generation with Generative AI

Diffusion-based Semantic-Discrepant Outlier Generation for Out-of-Distribution Detection

Suhee Yoon · Sanghyu Yoon · Hankook Lee · Sangjun Han · Ye Seul Sim · Kyungeun Lee · Hyeseung Cho · Woohyung Lim

Keywords: [ Outlier generation ] [ Out-of-distribution Detection ] [ Diffusion model ]


Out-of-distribution (OOD) detection, which determines whether a given sample belongs to the training distribution, has recently shown promising results when detectors are trained with synthetic OOD datasets. An effective synthetic OOD dataset has two important properties: (i) its samples should be close to the in-distribution (ID) data, yet (ii) they should carry semantically shifted information. To achieve this, we introduce a novel framework that consists of Semantic-Discrepant (SD) outlier generation and an advanced OOD detection method. For SD outlier generation, we utilize a conditional diffusion model trained with pseudo-labels. We then propose a simple yet effective method, semantic-discrepant guidance, which allows the model to generate realistic outliers that contain an incoherent semantic shift while preserving nuisance information (e.g., background). Furthermore, we suggest SD outlier-aware OOD detector training and scoring methods. Our experiments demonstrate the effectiveness of our framework on the CIFAR-10 dataset: we achieve an AUROC of 98% when CIFAR-100 is given as the OOD dataset. The SD outlier dataset on CIFAR-10 is available at
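As a reading aid for the AUROC figure quoted in the abstract, the following is a minimal sketch of how AUROC is commonly computed for OOD detection. It assumes a detector that assigns each sample a scalar score where a higher value means "more likely OOD". The function name `auroc` and the toy scores are hypothetical illustrations; the scoring method proposed in the paper is not described here and is not reproduced.

```python
import numpy as np


def auroc(id_scores, ood_scores):
    """Estimate AUROC as the probability that a randomly chosen OOD sample
    receives a higher score than a randomly chosen ID sample.
    Ties count as half. Assumes higher score = more OOD-like (an assumption)."""
    id_scores = np.asarray(id_scores, dtype=float)
    ood_scores = np.asarray(ood_scores, dtype=float)
    # Compare every OOD score against every ID score via broadcasting.
    # This uses O(n * m) memory, which is fine for a small illustration.
    greater = (ood_scores[:, None] > id_scores[None, :]).sum()
    ties = (ood_scores[:, None] == id_scores[None, :]).sum()
    return (greater + 0.5 * ties) / (len(id_scores) * len(ood_scores))


# Toy scores (made up for illustration, not results from the paper).
perfect = auroc([0.1, 0.2], [0.8, 0.9])   # every OOD score exceeds every ID score
chance = auroc([0.1, 0.9], [0.5, 0.5])    # OOD scores beat exactly half of the ID scores
```

Under this reading, an AUROC of 98% means that a randomly drawn OOD sample is scored above a randomly drawn ID sample in about 98% of pairs, and a value of 0.5 corresponds to chance-level separation.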
