Persona Trait Independence in Activation Steering Depends on Framework Design
Abstract
Activation steering uses trait vectors to control LLM behavior at inference time, but when multiple traits are extracted, are the resulting vectors independent of one another? We present a geometric analysis of three trait frameworks across three model families. We find that trait independence is primarily a property of the framework's design rather than of the model: psychometrically designed frameworks (OCEAN, MBTI) yield nearly orthogonal vectors (97--99\% effective rank), while the ad hoc trait collection we studied is substantially correlated (85\% effective rank). Surprisingly, Gram-Schmidt orthogonalization eliminates geometric crosstalk but produces negligible behavioral change, revealing a geometry-behavior gap. Our results suggest that compositional steering requires principled trait libraries and that geometric properties alone are insufficient to predict behavioral interactions.
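
For readers unfamiliar with the two geometric operations named above, the following is a minimal illustrative sketch, not the implementation used in this work. The abstract does not define effective rank, so the sketch assumes the entropy-based definition (the exponential of the Shannon entropy of the normalized singular values), reported as a fraction of the number of traits. The function names, the 64-dimensional vectors, and the synthetic "shared direction" used to induce correlation are all hypothetical.

```python
import numpy as np


def effective_rank_fraction(vectors):
    """Entropy-based effective rank of the row vectors, divided by their count.

    Assumed definition: exp(H(p)), where p are the singular values of the
    row-normalized matrix rescaled to sum to one.
    """
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    s = np.linalg.svd(unit, compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]  # drop exact zeros so log() stays finite
    entropy = -(p * np.log(p)).sum()
    return float(np.exp(entropy)) / vectors.shape[0]


def gram_schmidt(vectors):
    """Orthonormalize the rows in order; each row loses its earlier-row components."""
    basis = []
    for v in vectors:
        w = np.asarray(v, dtype=float).copy()
        for b in basis:
            w -= (w @ b) * b  # remove the projection onto an earlier direction
        norm = np.linalg.norm(w)
        if norm < 1e-12:
            raise ValueError("rows are linearly dependent")
        basis.append(w / norm)
    return np.stack(basis)


# Synthetic stand-ins for trait vectors (hypothetical data, not model activations):
# five random directions that all share one common component.
rng = np.random.default_rng(0)
shared = rng.normal(size=64)
correlated = rng.normal(size=(5, 64)) + 1.5 * shared

orthogonal = gram_schmidt(correlated)

frac_before = effective_rank_fraction(correlated)
frac_after = effective_rank_fraction(orthogonal)
gram = orthogonal @ orthogonal.T  # close to the identity after orthogonalization
```

Under this assumed definition, a set of exactly orthonormal vectors has an effective rank fraction of 1, and any shared component pulls the fraction below 1. The sketch covers only the geometric side; it says nothing about whether steering behavior changes after orthogonalization.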