Measuring LLM Generation Spaces with EigenScore
Abstract
An LLM's generation space for a given prompt (the range of semantically distinct outputs it could produce) provides a window into the model's implicit task representation. We currently lack a metric for characterizing this space. In this work, we argue that the EigenScore metric, originally developed for hallucination detection, captures the size of this generation space. To develop this understanding, we construct synthetic datasets of prompt pairs with known generation space relationships (complement, subset, etc.). We show that EigenScore reliably predicts a prompt's generation space size, outperforming alternative metrics such as perplexity and entropy. We provide further evidence for this interpretation by showing a strong connection between a prompt's EigenScore and the number of reasoning tokens the model spends on that prompt. Our work uses EigenScore to contribute a cognitive understanding of a model's generation space size and how it relates to the reasoning abilities of LLMs.
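As a reference point for how the metric is computed, the following is a minimal sketch of an EigenScore-style computation, assuming hidden-state sentence embeddings for K sampled generations are already available; the centering step and the regularization constant `alpha` follow the standard log-determinant formulation from the hallucination-detection literature and may differ in detail from the exact setup used in our experiments.

```python
# Minimal sketch of an EigenScore-style computation (assumption: sentence
# embeddings for K sampled generations are already extracted from the model).
import numpy as np

def eigenscore(embeddings: np.ndarray, alpha: float = 1e-3) -> float:
    """Log-determinant of the regularized covariance of K generation embeddings.

    embeddings: array of shape (K, d), one embedding per sampled output.
    Higher values indicate a more semantically diverse (larger) generation space.
    """
    K, _ = embeddings.shape
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)  # center in embedding space
    # K x K Gram matrix of centered embeddings; its nonzero eigenvalues match
    # those of the d x d covariance, but K << d keeps the computation cheap.
    gram = centered @ centered.T
    eigvals = np.linalg.eigvalsh(gram + alpha * np.eye(K))  # regularize for numerical stability
    return float(np.sum(np.log(eigvals)) / K)

# Toy usage: 10 sampled generations with 4096-dimensional embeddings (random data).
rng = np.random.default_rng(0)
diverse = rng.normal(size=(10, 4096))  # spread-out embeddings -> higher score
repetitive = np.tile(rng.normal(size=(1, 4096)), (10, 1)) + 0.01 * rng.normal(size=(10, 4096))
print(eigenscore(diverse), eigenscore(repetitive))
```

Intuitively, when sampled generations cluster tightly in embedding space the covariance eigenvalues shrink and the score drops, while widely spread generations yield a larger log-volume; this is the sense in which EigenScore can be read as tracking generation space size.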