210 Results
Poster | Wed 11:00 | Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses | Xiaosen Zheng · Tianyu Pang · Chao Du · Qian Liu · Jing Jiang · Min Lin
Poster | Wed 16:30 | UnlearnCanvas: Stylized Image Dataset for Enhanced Machine Unlearning Evaluation in Diffusion Models | Yihua Zhang · Chongyu Fan · Yimeng Zhang · Yuguang Yao · Jinghan Jia · Jiancheng Liu · Gaoyuan Zhang · Gaowen Liu · Ramana Kompella · Xiaoming Liu · Sijia Liu
Poster | Wed 11:00 | The Best of Both Worlds: On the Dilemma of Out-of-distribution Detection | Qingyang Zhang · Qiuxuan Feng · Joey Tianyi Zhou · Yatao Bian · Qinghua Hu · Changqing Zhang
Poster | Wed 11:00 | Federated Model Heterogeneous Matryoshka Representation Learning | Liping Yi · Han Yu · Chao Ren · Gang Wang · xiaoguang Liu · Xiaoxiao Li
Poster | Fri 11:00 | JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models | Patrick Chao · Edoardo Debenedetti · Alexander Robey · Maksym Andriushchenko · Francesco Croce · Vikash Sehwag · Edgar Dobriban · Nicolas Flammarion · George J. Pappas · Florian Tramer · Hamed Hassani · Eric Wong
Poster | Thu 11:00 | DAT: Improving Adversarial Robustness via Generative Amplitude Mix-up in Frequency Domain | Fengpeng Li · Kemou Li · Haiwei Wu · Jinyu Tian · Jiantao Zhou
Poster | Fri 16:30 | Membership Inference on Text-to-Image Diffusion Models via Conditional Likelihood Discrepancy | Shengfang ZHAI · Huanran Chen · Yinpeng Dong · Jiajun Li · Qingni Shen · Yansong Gao · Hang Su · Yang Liu
Poster | Thu 11:00 | The Art of Saying No: Contextual Noncompliance in Language Models | Faeze Brahman · Sachin Kumar · Vidhisha Balachandran · Pradeep Dasigi · Valentina Pyatkin · Abhilasha Ravichander · Sarah Wiegreffe · Nouha Dziri · Khyathi Chandu · Jack Hessel · Yulia Tsvetkov · Noah Smith · Yejin Choi · Hanna Hajishirzi
Poster | Fri 16:30 | FreqBlender: Enhancing DeepFake Detection by Blending Frequency Knowledge | hanzhe li · Jiaran Zhou · Yuezun Li · Baoyuan Wu · Bin Li · Junyu Dong
Poster | Wed 16:30 | CausalDiff: Causality-Inspired Disentanglement via Diffusion Model for Adversarial Defense | Mingkun Zhang · Keping Bi · Wei Chen · Quanrun Chen · Jiafeng Guo · Xueqi Cheng
Poster | Wed 16:30 | Learning the Latent Causal Structure for Modeling Label Noise | Yexiong Lin · Yu Yao · Tongliang Liu
Poster | Thu 16:30 | Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks | Andy Zhou · Bo Li · Haohan Wang