34 Results
Workshop | Sat 17:27
Imitation Guided Automated Red Teaming
Sajad Mousavi · Desik Rengarajan · Ashwin Ramesh Babu · Vineet Gundecha · Soumyendu Sarkar

Workshop
Decompose, Recompose, and Conquer: Multi-modal LLMs are Vulnerable to Compositional Adversarial Attacks in Multi-Image Queries
Julius Broomfield · George Ingebretsen · Reihaneh Iranmanesh · Sara Pieri · Ethan Kosak-Hine · Tom Gibbs · Reihaneh Rabbany · Kellin Pelrine

Workshop
Aligning to What? Limits to RLHF Based Alignment
Logan Barnhart · Reza Akbarian Bafghi · Maziar Raissi · Stephen Becker

Workshop
Imitation guided Automated Red Teaming
Desik Rengarajan · Sajad Mousavi · Ashwin Ramesh Babu · Vineet Gundecha · Avisek Naug · Sahand Ghorbanpour · Antonio Guillen-Perez · Ricardo Luna Gutierrez · Soumyendu Sarkar

Workshop
Contextual evaluation of Large Language Models for Classifying Tropical and Infectious Diseases
Mercy Asiedu · Nenad Tomasev · Chintan Ghate · Tiya Tiyasirichokchai · Awa Dieng · Oluwatosin Akande · Geoffrey Siwo · Steve Adudans · Sylvanus Aitkins · Odianosen Ehiakhamen · Eric Ndombi · Katherine Heller

Workshop
LLM-Assisted Red Teaming of Diffusion Models through "Failures Are Fated, But Can Be Faded"
Som Sagar · Aditya Taparia · Ransalu Senanayake

Workshop
SkewAct: Red Teaming Large Language Models via Activation-Skewed Adversarial Prompt Optimization
Hanxi Guo · Siyuan Cheng · Guanhong Tao · Guangyu Shen · Zhuo Zhang · Shengwei An · Kaiyuan Zhang · Xiangyu Zhang

Workshop
Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI
Ambrish Rawat · Stefan Schoepf · Giulio Zizzo · Giandomenico Cornacchia · Muhammad Zaid Hameed · Kieran Fraser · Erik Miehling · Beat Buesser · Elizabeth Daly · Mark Purcell · Prasanna Sattigeri · Pin-Yu Chen · Kush Varshney

Workshop
Lessons From Red Teaming 100 Generative AI Products
Blake Bullwinkel · Amanda Minnich · Shiven Chawla · Gary Lopez Munoz · Martin Pouliot · Whitney Maxwell · Joris de Gruyter · Katherine Pratt · Saphir Qi · Nina Chikanov · Roman Lutz · Raja Sekhar Rao Dheekonda · Bolor-Erdene Jagdagdorj · Rich Lundeen · Sam Vaughan · Victoria Westerhoff · Pete Bryan · Ram Shankar Siva Kumar · Yonatan Zunger · Mark Russinovich

Workshop
SAGE-RT: Synthetic Alignment data Generation for Safety Evaluation and Red Teaming
Anurakt Kumar · Divyanshu Kumar · Jatan Loya · Nitin Aravind Birur · Tanay Baswa · Sahil Agarwal · Prashanth Harshangi