Probe-Rewrite-Evaluate: A Workflow for Reliable Benchmarks and Quantifying Evaluation Awareness

Lang Xiong · Nishant Bhargava · Jeremy Chang · Jianhang Hong · Haihao Liu · Vasu Sharma · Kevin Zhu

Abstract
