
Workshop
Retention Score: Quantifying Jailbreak Risks for Vision Language Models
Zaitang Li · Pin-Yu Chen · Tsung-Yi Ho
Workshop
Token Highlighter: Inspecting and Mitigating Jailbreak Prompts for Large Language Models
Xiaomeng Hu · Pin-Yu Chen · Tsung-Yi Ho
Workshop
Zer0-Jack: A memory-efficient gradient-based jailbreaking method for black box Multi-modal Large Language Models
Tiejin Chen · Kaishen Wang · Hua Wei
Workshop
DeepInception: Hypnotize Large Language Model to Be Jailbreaker
Xuan Li · Zhanke Zhou · Jianing Zhu · Jiangchao Yao · Tongliang Liu · Bo Han
Workshop
LLM Improvement for Jailbreak Defense: Analysis Through the Lens of Over-Refusal
Swetasudha Panda · Naveen Jafer Nizar · Michael Wick
Workshop
Testing the Limits of Jailbreaking with the Purple Problem
Taeyoun Kim · Suhas Kotha · Aditi Raghunathan
Workshop
AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks
Yifan Zeng · Yiran Wu · Xiao Zhang · Huazheng Wang · Qingyun Wu