Skip to yearly menu bar Skip to main content


Search All 2024 Events
 

39 Results

<<   <   Page 3 of 4   >   >>
Workshop
Diverse and Effective Red Teaming with Auto-generated Rewards and Multi-step Reinforcement Learning
Alex Beutel · Kai Xiao · Johannes Heidecke · Lilian Weng
Workshop
Decompose, Recompose, and Conquer: Multi-modal LLMs are Vulnerable to Compositional Adversarial Attacks in Multi-Image Queries
Julius Broomfield · George Ingebretsen · Reihaneh Iranmanesh · Sara Pieri · Ethan Kosak-Hine · Tom Gibbs · Reihaneh Rabbany · Kellin Pelrine
Workshop
Aligning to What? Limits to RLHF Based Alignment
Logan Barnhart · Reza Akbarian Bafghi · Maziar Raissi · Stephen Becker
Workshop
Contextual evaluation of Large Language Models for Classifying Tropical and Infectious Diseases
Mercy Asiedu · Nenad Tomasev · Chintan Ghate · Tiya Tiyasirichokchai · Awa Dieng · Oluwatosin Akande · Geoffrey Siwo · Steve Adudans · Sylvanus Aitkins · Odianosen Ehiakhamen · Eric Ndombi · Katherine Heller
Workshop
Imitation guided Automated Red Teaming
Desik Rengarajan · Sajad Mousavi · Ashwin Ramesh Babu · Vineet Gundecha · Avisek Naug · Sahand Ghorbanpour · Antonio Guillen-Perez · Ricardo Luna Gutierrez · Soumyendu Sarkar
Workshop
LLM-Assisted Red Teaming of Diffusion Models through "Failures Are Fated, But Can Be Faded"
Som Sagar · Aditya Taparia · Ransalu Senanayake
Workshop
SkewAct: Red Teaming Large Language Models via Activation-Skewed Adversarial Prompt Optimization
Hanxi Guo · Siyuan Cheng · Guanhong Tao · Guangyu Shen · Zhuo Zhang · Shengwei An · Kaiyuan Zhang · Xiangyu Zhang
Competition
Sun 10:00 Competition Overview and Invited Team Talks
Chandler Smith
Workshop
Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI
Ambrish Rawat · Stefan Schoepf · Giulio Zizzo · Giandomenico Cornacchia · Muhammad Zaid Hameed · Kieran Fraser · Erik Miehling · Beat Buesser · Elizabeth Daly · Mark Purcell · Prasanna Sattigeri · Pin-Yu Chen · Kush Varshney
Workshop
Lessons From Red Teaming 100 Generative AI Products
Blake Bullwinkel · Amanda Minnich · Shiven Chawla · Gary Lopez Munoz · Martin Pouliot · Whitney Maxwell · Joris de Gruyter · Katherine Pratt · Saphir Qi · Nina Chikanov · Roman Lutz · Raja Sekhar Rao Dheekonda · Bolor-Erdene Jagdagdorj · Rich Lundeen · Sam Vaughan · Victoria Westerhoff · Pete Bryan · Ram Shankar Siva Kumar · Yonatan Zunger · Mark Russinovich
Workshop
SAGE-RT: Synthetic Alignment data Generation for Safety Evaluation and Red Teaming
Anurakt Kumar · Divyanshu Kumar · Jatan Loya · Nitin Aravind Birur · Tanay Baswa · Sahil Agarwal · Prashanth Harshangi
Poster
Wed 16:30 Privacy without Noisy Gradients: Slicing Mechanism for Generative Model Training
Kristjan Greenewald · Yuancheng Yu · Hao Wang · Kai Xu