Persona Subgraphs: Discovering and Steering Persona-Specific Circuits in Language Models via Sparse Autoencoder Features

Arul Murugan

Abstract
