Poster
|
Thu 16:30
|
WAGLE: Strategic Weight Attribution for Effective and Modular Unlearning in Large Language Models
Jinghan Jia · Jiancheng Liu · Yihua Zhang · Parikshit Ram · Nathalie Baracaldo · Sijia Liu
|
|
Poster
|
Wed 11:00
|
Erasing Undesirable Concepts in Diffusion Models with Adversarial Preservation
Anh Bui · Tung-Long Vuong · Khanh Doan · Trung Le · Paul Montague · Tamas Abraham · Dinh Phung
|
|
Poster
|
Thu 16:30
|
Differentially Private Stochastic Gradient Descent with Fixed-Size Minibatches: Tighter RDP Guarantees with or without Replacement
Jeremiah Birrell · Reza Ebrahimi · Rouzbeh Behnia · Jason Pacheco
|
|
Poster
|
Wed 11:00
|
Studying How to Efficiently and Effectively Guide Models with Explanations - A Reproducibility Study
Adrian Sauter · Milan Miletić · Ryan Ott · Rohith Prabakaran
|
|
Social
|
Thu 19:30
|
Space and AI
Anne Spalding · Gabriel Sutherland · Alexander Lavin
|
|
Poster
|
Fri 11:00
|
PrivAuditor: Benchmarking Data Protection Vulnerabilities in LLM Adaptation Techniques
Derui Zhu · Dingfan Chen · Xiongfei Wu · Jiahui Geng · Zhuo Li · Jens Grossklags · Lei Ma
|
|
Affinity Event
|
|
Learning to Reweight Examples in Backdoor Defense
Yufan Feng · Benjamin Tan · Yani Ioannou
|
|
Poster
|
Wed 11:00
|
An Analysis of Robustness of Non-Lipschitz Networks
Maria-Florina Balcan · Avrim Blum · Dravyansh Sharma · Hongyang Zhang
|
|
Affinity Event
|
|
Improving Harm Reduction Tactics for a Multimodal Software Solution
Lucia Berger
|
|
Expo Talk Panel
|
Tue 16:00
|
Symbolic AI and Foundation Models Integration towards Reliable and Trustworthy Industry-grade AI Systems
Vítor Lourenço · Audrey Depeige · Charles Ivie · Ora Lassila · George Karypis
|
|
Affinity Event
|
Tue 14:00
|
Invited Talk 2 by Lama Ahmad (Technical Program Manager, Trustworthy AI at OpenAI): Human and AI Evaluations for Safety and Robustness Testing
Lama Ahmad
|
|
Poster
|
Thu 16:30
|
Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks
Andy Zhou · Bo Li · Haohan Wang
|
|