Cannot or Should Not? Automatic Analysis of Refusal Composition in IFT/RLHF Datasets and Refusal Behavior of Black-Box LLMs

Alexander von Recum · Christoph Schnabl · Gabor Hollbeck · Marvin von Hagen · Silas Alberti · Philip Blinde

Abstract
