2 Results
Poster | Thu 11:00 | WikiContradict: A Benchmark for Evaluating LLMs on Real-World Knowledge Conflicts from Wikipedia | Yufang Hou · Alessandra Pascale · Javier Carnerero-Cano · Tigran Tchrakian · Radu Marinescu · Elizabeth Daly · Inkit Padhi · Prasanna Sattigeri
Workshop | Evaluations Using Wikipedia without Data Leakage: From Trusting Articles to Trusting Edit Processes | Lucie-Aimée Kaffee · Isaac Johnson