Search All 2024 Events
 

66 Results

Page 5 of 6
Poster
Fri 11:00 GREAT Score: Global Robustness Evaluation of Adversarial Perturbation using Generative Models
Zaitang Li · Pin-Yu Chen · Tsung-Yi Ho
Workshop
Robust Feature Learning for Multi-Index Models in High Dimensions
Alireza Mousavi-Hosseini · Adel Javanmard · Murat Erdogdu
Workshop
Robustness of Practical Perceptual Hashing Algorithms to Hash-Evasion and Hash-Inversion Attacks
Jordan Madden · Moxanki Bhavsar · Lhamo Dorje · Xiaohua Li
Workshop
Certifying Robustness via Topological Representations
Jens Agerberg · Andrea Guidolin · Andrea Martinelli · Pepijn Hoefgeest · David Eklund · Martina Scolamiero
Workshop
Sparse patches adversarial attacks via extrapolating point-wise information
Yaniv Nemcovsky · Avi Mendelson · Chaim Baskin
Workshop
LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet
Nathaniel Li · Ziwen Han · Ian Steneker · Willow Primack · Riley Goodside · Hugh Zhang · Zifan Wang · Cristina Menghini · Summer Yue
Workshop
Sun 11:05 Contributed Talk 3: LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet
Nathaniel Li · Ziwen Han · Ian Steneker · Willow Primack · Riley Goodside · Hugh Zhang · Zifan Wang · Cristina Menghini · Summer Yue
Workshop
Shh, don't say that! Domain Certification in LLMs
Cornelius Emde · Preetham Arvind · Alasdair Paren · Maxime Kayser · Thomas Rainforth · Thomas Lukasiewicz · Philip Torr · Adel Bibi
Poster
Fri 11:00 Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models
Yimeng Zhang · Xin Chen · Jinghan Jia · Yihua Zhang · Chongyu Fan · Jiancheng Liu · Mingyi Hong · Ke Ding · Sijia Liu
Workshop
Jailbreak Defense in a Narrow Domain: Failures of Existing Methods and Improving Transcript-Based Classifiers
Tony Wang · John Hughes · Henry Sleight · Rylan Schaeffer · Rajashree Agrawal · Fazl Barez · Mrinank Sharma · Jesse Mu · Nir Shavit · Ethan Perez
Workshop
Jailbreak Defense in a Narrow Domain: Failures of Existing Methods and Improving Transcript-Based Classifiers
Tony Wang · John Hughes · Henry Sleight · Rylan Schaeffer · Rajashree Agrawal · Fazl Barez · Mrinank Sharma · Jesse Mu · Nir Shavit · Ethan Perez
Poster
Thu 11:00 On the Scalability of Certified Adversarial Robustness with Generated Data
Thomas Altstidl · David Dobre · Arthur Kosmala · Bjoern Eskofier · Gauthier Gidel · Leo Schwinn