Poster

Soft Prompt Threats: Attacking Safety Alignment and Unlearning in Open-Source LLMs through the Embedding Space

Leo Schwinn · David Dobre · Sophie Xhonneux · Gauthier Gidel · Stephan Günnemann
2024 Poster
