Workshop | | Evaluations Using Wikipedia without Data Leakage: From Trusting Articles to Trusting Edit Processes | Lucie-Aimée Kaffee · Isaac Johnson
Workshop | | Evaluating Explanations Through LLMs: Beyond Traditional User Studies | Francesco Bombassei De Bona · Gabriele Dominici · Tim Miller · Marc Langheinrich · Martin Gjoreski
Workshop | Sat 8:15 | GenAI for Health: Potential, Trust and Policy Compliance | Junyuan Hong · Pranav Rajpurkar · Jason Fries · Marina Sirota · Ying Ding
Workshop | | ASTRID - An Automated and Scalable TRIaD for the Evaluation of RAG-based Clinical Question Answering Systems | Mohita Chowdhury · Yajie He · Ernest Lim · Aisling Higham
Workshop | | Trust but Verify: Reliable VLM evaluation in-the-wild with program synthesis | Viraj Uday Prabhu · Senthil Purushwalkam · Jieyu Zhang · An Yan · Caiming Xiong · Ran Xu
Workshop | | TR-BEACON: Shedding Light on Efficient Behavior Discovery in High-Dimensional Spaces with Bayesian Novelty Search over Trust Regions | Wei-Ting Tang · Ankush Chakrabarty · Joel Paulson
Workshop | Sat 12:00 | Conformal Alignment: Knowing When to Trust Foundation Models with Guarantees | Yu Gui · Ying Jin · Zhimei Ren
Workshop | Sat 15:45 | Diffusion-Powered Image Super-Resolution That You Can Actually Trust | Daniel Csillag · Eduardo Adame · Guilherme Tegoni Goedert