NeurIPS Competition CLAS 2024: The Competition for LLM and Agent Safety

Zhen Xiang · Yi Zeng · Mintong Kang · Chejian Xu · Jiawei Zhang · Zhuowen Yuan · Zhaorun Chen · Chulin Xie · Fengqing Jiang · Minzhou Pan · Francesco Pinto · Junyuan Hong · Ruoxi Jia · Radha Poovendran · Bo Li

West Meeting Room 210

Sun 15 Dec 1:30 p.m. PST — 4:30 p.m. PST

Abstract:

Ensuring safety emerges as a pivotal objective in developing large language models (LLMs) and LLM-powered agents. The Competition for LLM and Agent Safety (CLAS) aims to advance the understanding of the vulnerabilities in LLMs and LLM-powered agents and to encourage methods for improving their safety. The competition features three main tracks linked through the methodology of prompt injection, with tasks designed to amplify societal impact by involving practical adversarial objectives for different domains. In the Jailbreaking Attack track, participants are challenged to elicit harmful outputs in guardrail LLMs via prompt injection. In the Backdoor Trigger Recovery for Models track, participants are given a CodeGen LLM embedded with hundreds of domain-specific backdoors. They are asked to reverse-engineer the trigger for each given target. In the Backdoor Trigger Recovery for Agents track, trigger reverse engineering will be focused on eliciting specific backdoor targets based on malicious agent actions. As the first competition addressing the safety of both LLMs and LLM agents, CLAS 2024 aims to foster collaboration between various communities promoting research and tools for enhancing the safety of LLMs and real-world AI systems.

Schedule

Sun 1:30 p.m. - 2:00 p.m.	Opening, Introduction, and Competition Summary
Sun 2:00 p.m. - 2:25 p.m.	Invited talk: Furong Huang
Sun 2:25 p.m. - 2:35 p.m.	Announcing winners
Sun 2:35 p.m. - 3:25 p.m.	Description of winning methods
Sun 3:25 p.m. - 3:50 p.m.	Invited talk: Yu Su
Sun 3:50 p.m. - 4:00 p.m.	Break
Sun 4:00 p.m. - 4:25 p.m.	Invited talk: Chaowei Xiao
Sun 4:25 p.m. - 4:30 p.m.	Wrapping up