Poster in Workshop: Towards Safe & Trustworthy Agents

Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference

Anton Xue · Avishree Khare · Rajeev Alur · Surbhi Goel · Eric Wong

Abstract
