210 Results
Poster | Thu 11:00 | Rethinking LLM Memorization through the Lens of Adversarial Compression | Avi Schwarzschild · Zhili Feng · Pratyush Maini · Zachary Lipton · J. Zico Kolter
Poster | Wed 16:30 | AHA: Human-Assisted Out-of-Distribution Generalization and Detection | Haoyue Bai · Jifan Zhang · Robert Nowak
Poster | Fri 11:00 | Designs for Enabling Collaboration in Human-Machine Teaming via Interactive and Explainable Systems | Rohan Paleja · Michael Munje · Kimberlee Chang · Reed Jensen · Matthew Gombolay
Affinity Event | | Mitigating Bias in Queer Representation within Large Language Models: A Collaborative Agent Approach | Tianyi Huang · Arya Somasundaram
Poster | Thu 11:00 | MedSafetyBench: Evaluating and Improving the Medical Safety of Large Language Models | Tessa Han · Aounon Kumar · Chirag Agarwal · Himabindu Lakkaraju
Poster | Fri 11:00 | MultiTrust: A Comprehensive Benchmark Towards Trustworthy Multimodal Large Language Models | Yichi Zhang · Yao Huang · Yitong Sun · Chang Liu · Zhe Zhao · Zhengwei Fang · Yifan Wang · Huanran Chen · Xiao Yang · Xingxing Wei · Hang Su · Yinpeng Dong · Jun Zhu
Poster | Fri 16:30 | CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models | Peng Xia · Ze Chen · Juanxi Tian · Yangrui Gong · Ruibo Hou · Yue Xu · Zhenbang Wu · Zhiyuan Fan · Yiyang Zhou · Kangyu Zhu · Wenhao Zheng · Zhaoyang Wang · Xiao Wang · Xuchao Zhang · Chetan Bansal · Marc Niethammer · Junzhou Huang · Hongtu Zhu · Yun Li · Jimeng Sun · Zongyuan Ge · Gang Li · James Zou · Huaxiu Yao
Poster | Thu 11:00 | Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs | Zhao Xu · Fan LIU · Hao Liu
Poster | Fri 16:30 | T2VSafetyBench: Evaluating the Safety of Text-to-Video Generative Models | Yibo Miao · Yifan Zhu · Lijia Yu · Jun Zhu · Xiao-Shan Gao · Yinpeng Dong
Poster | Thu 16:30 | PertEval: Unveiling Real Knowledge Capacity of LLMs with Knowledge-Invariant Perturbations | Jiatong Li · Renjun Hu · Kunzhe Huang · Yan Zhuang · Qi Liu · Mengxiao Zhu · Xing Shi · Wei Lin
Poster | Wed 16:30 | Evaluating Copyright Takedown Methods for Language Models | Boyi Wei · Weijia Shi · Yangsibo Huang · Noah Smith · Chiyuan Zhang · Luke Zettlemoyer · Kai Li · Peter Henderson
Poster | Thu 16:30 | Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks | Andy Zhou · Bo Li · Haohan Wang