Poster in Workshop: Reliable ML from Unreliable Data
Sat, Dec 6, 2025 • 4:00 PM – 5:00 PM PST

Breaking the Mirror: Activation-Based Mitigation of Self-Preference in LLM Evaluators

Dani Roytburg ⋅ Matthew Nguyen ⋅ Matthew Bozoukov ⋅ Hongyu Fu ⋅ Jou Barzdukas ⋅ Narmeen Oozeer

Abstract
