firstbacksecondback
54 Results
Workshop
|
Adversarial Watermarking for Face Recognition Yuguang Yao · Anil Jain · Sijia Liu |
||
Workshop
|
Does Refusal Training in LLMs Generalize to the Past Tense? Maksym Andriushchenko · Nicolas Flammarion |
||
Poster
|
Thu 11:00 |
DeSparsify: Adversarial Attack Against Token Sparsification Mechanisms Oryan Yehezkel · Alon Zolfi · Amit Baras · Yuval Elovici · Asaf Shabtai |
|
Workshop
|
Plentiful Jailbreaks with String Compositions Brian Huang |
||
Poster
|
Fri 11:00 |
SuperDeepFool: a new fast and accurate minimal adversarial attack alireza abdollahpour · Mahed Abroshan · Seyed-Mohsen Moosavi-Dezfooli |
|
Workshop
|
Plentiful Jailbreaks with String Compositions Brian Huang |
||
Workshop
|
Decompose, Recompose, and Conquer: Multi-modal LLMs are Vulnerable to Compositional Adversarial Attacks in Multi-Image Queries Julius Broomfield · George Ingebretsen · Reihaneh Iranmanesh · Sara Pieri · Ethan Kosak-Hine · Tom Gibbs · Reihaneh Rabbany · Kellin Pelrine |
||
Workshop
|
Infecting LLM Agents via Generalizable Adversarial Attack Weichen Yu · Kai Hu · Tianyu Pang · Chao Du · Min Lin · Matt Fredrikson |
||
Workshop
|
Sun 16:50 |
Contributed Talk 6: Infecting LLM Agents via Generalizable Adversarial Attack Weichen Yu · Kai Hu · Tianyu Pang · Chao Du · Min Lin · Matt Fredrikson |
|
Workshop
|
Sun 11:05 |
Contributed Talk 3: LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet Nathaniel Li · Ziwen Han · Ian Steneker · Willow Primack · Riley Goodside · Hugh Zhang · Zifan Wang · Cristina Menghini · Summer Yue |
|
Workshop
|
LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet Nathaniel Li · Ziwen Han · Ian Steneker · Willow Primack · Riley Goodside · Hugh Zhang · Zifan Wang · Cristina Menghini · Summer Yue |
||
Workshop
|
Sat 10:45 |
When Do Universal Image Jailbreaks Transfer Between Vision-Language Models? Rylan Schaeffer · Dan Valentine · Luke Bailey · James Chua · Cristobal Eyzaguirre · Zane Durante · Joe Benton · Brando Miranda · Henry Sleight · Tony Wang · John Hughes · Rajashree Agrawal · Mrinank Sharma · Scott Emmons · Sanmi Koyejo · Ethan Perez |