Liminal Training: Characterizing and Mitigating Subliminal Learning in Large Language Models

Atsushi Yanagisawa ⋅ Akbarzaib Khan ⋅ Thanjeetraaj Kaur Balraj Singh ⋅ Yunjong Na ⋅ Kevin Zhu ⋅ Antonio Mari

Abstract
