Poster in Workshop: Workshop on Scaling Environments for Agents
Sun, Dec 7, 2025 • 12:30 PM – 1:30 PM PST

DefenderBench: A Toolkit for Evaluating Language Agents in Cybersecurity Environments

Chiyu Zhang · Marc-Alexandre Côté · Michael Albada · Anush Sankaran · Jack Stokes · Tong Wang · Amir Abdi · William Blum · Muhammad Abdul-Mageed

Abstract
