Search All 2024 Events
 

54 Results

Workshop
Adversarial Watermarking for Face Recognition
Yuguang Yao · Anil Jain · Sijia Liu
Workshop
Does Refusal Training in LLMs Generalize to the Past Tense?
Maksym Andriushchenko · Nicolas Flammarion
Poster
Thu 11:00 DeSparsify: Adversarial Attack Against Token Sparsification Mechanisms
Oryan Yehezkel · Alon Zolfi · Amit Baras · Yuval Elovici · Asaf Shabtai
Workshop
Plentiful Jailbreaks with String Compositions
Brian Huang
Poster
Fri 11:00 SuperDeepFool: a new fast and accurate minimal adversarial attack
Alireza Abdollahpour · Mahed Abroshan · Seyed-Mohsen Moosavi-Dezfooli
Workshop
Decompose, Recompose, and Conquer: Multi-modal LLMs are Vulnerable to Compositional Adversarial Attacks in Multi-Image Queries
Julius Broomfield · George Ingebretsen · Reihaneh Iranmanesh · Sara Pieri · Ethan Kosak-Hine · Tom Gibbs · Reihaneh Rabbany · Kellin Pelrine
Workshop
Infecting LLM Agents via Generalizable Adversarial Attack
Weichen Yu · Kai Hu · Tianyu Pang · Chao Du · Min Lin · Matt Fredrikson
Workshop
Sun 16:50 Contributed Talk 6: Infecting LLM Agents via Generalizable Adversarial Attack
Weichen Yu · Kai Hu · Tianyu Pang · Chao Du · Min Lin · Matt Fredrikson
Workshop
Sun 11:05 Contributed Talk 3: LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet
Nathaniel Li · Ziwen Han · Ian Steneker · Willow Primack · Riley Goodside · Hugh Zhang · Zifan Wang · Cristina Menghini · Summer Yue
Workshop
LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet
Nathaniel Li · Ziwen Han · Ian Steneker · Willow Primack · Riley Goodside · Hugh Zhang · Zifan Wang · Cristina Menghini · Summer Yue
Workshop
Sat 10:45 When Do Universal Image Jailbreaks Transfer Between Vision-Language Models?
Rylan Schaeffer · Dan Valentine · Luke Bailey · James Chua · Cristobal Eyzaguirre · Zane Durante · Joe Benton · Brando Miranda · Henry Sleight · Tony Wang · John Hughes · Rajashree Agrawal · Mrinank Sharma · Scott Emmons · Sanmi Koyejo · Ethan Perez