Poster in Workshop: Safe Generative AI

INTERPRETABILITY OF LLM DECEPTION: UNIVERSAL MOTIF

Wannan Yang ⋅ Chen Sun ⋅ Gyorgy Buzsaki

Abstract
