Poster in Workshop: Reliable ML from Unreliable Data

Breaking the Mirror: Activation-Based Mitigation of Self-Preference in LLM Evaluators

Dani Roytburg ⋅ Matthew Nguyen ⋅ Matthew Bozoukov ⋅ Hongyu Fu ⋅ Jou Barzdukas ⋅ Narmeen Oozeer
2025 Poster

Abstract
