Search All 2024 Events
 

73 Results

Workshop
Sun 11:15 Invited Talk 3: (Been Kim, Senior Staff Research Scientist, Google DeepMind)
Zaina Shaik
Workshop
Representation Tuning
Christopher Ackerman
Workshop
Sat 15:45 Formal Analysis and Unification of Generalization in Deep Reinforcement Learning
Ezgi Korkmaz
Affinity Event
Tue 14:00 Invited Talk 2 by Lama Ahmad (Technical Program Manager, Trustworthy AI at OpenAI): Human and AI Evaluations for Safety and Robustness Testing
Lama Ahmad
Workshop
Sun 17:00 Invited Talk 7: Max Kaufmann on Red-teaming AI systems in government
Max Kaufmann
Affinity Event
Position Paper: The Urgent Need for Advancements in Machine Unlearning Algorithms to Ensure AI Safety
Yashaswini Viswanath · Vishwanath Hulipalled · Kaustubha Vecham
Workshop
Plentiful Jailbreaks with String Compositions
Brian Huang
Workshop
Does Refusal Training in LLMs Generalize to the Past Tense?
Maksym Andriushchenko · Nicolas Flammarion
Workshop
Sat 12:00 Weak-to-Strong Confidence Prediction
Yukai Yang · Tracy Zhu · Marco Morucci · Tim G. J. Rudner
Workshop
Combining Domain and Alignment Vectors to Achieve Better Knowledge-Safety Trade-offs in LLMs
Megh Thakkar · Yash More · Quentin Fournier · Matthew Riemer · Pin-Yu Chen · Amal Zouaq · Payel Das · Sarath Chandar
Workshop
AIR-Bench 2024: Safety Evaluation Based on Risk Categories from Regulations and Policies
Kevin Klyman