Poster in Workshop: Safe Generative AI

Refusal Tokens: A Simple Way to Calibrate Refusals in Large Language Models

Neel Jain · Aditya Shrivastava · Chenyang Zhu · Daben Liu · Alfy Samuel · Ashwinee Panda · Anoop Kumar · Micah Goldblum · Tom Goldstein

Abstract
