14 Results
| Type | Time | Title | Authors |
|---|---|---|---|
| Workshop | | Agentic Anomaly Detection for Shipping | Alexander Timms · Abigail Langbridge · Fearghal O'Donncha |
| Workshop | | FEABench: Evaluating Language Models on Real World Physics Reasoning Ability | Nayantara Mudur · Hao Cui · Subhashini Venugopalan · Paul Raccuglia · Michael Brenner · Peter Norgaard |
| Workshop | | Improving Decision-Making in Open-World Agents with Conformal Prediction and Monty Hall | Harit Vishwakarma · Alan Mishler · Thomas Cook · Niccolo Dalmasso · Natraj Raman · Sumitra Ganesh |
| Workshop | Sat 12:00 | Monty Hall and Score Optimization in Conformal Prediction to Improve LLMs for MCQs | Harit Vishwakarma · Alan Mishler · Thomas Cook · Niccolo Dalmasso · Natraj Raman · Sumitra Ganesh |
| Poster | Thu 11:00 | AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning | Shirley Wu · Shiyu Zhao · Qian Huang · Kexin Huang · Michihiro Yasunaga · Kaidi Cao · Vassilis Ioannidis · Karthik Subbian · Jure Leskovec · James Zou |
| Workshop | | GTA: A Benchmark for General Tool Agents | Jize Wang · Ma Zerun · Yining Li · Songyang Zhang · Cailian Chen · Kai Chen · Xinyi Le |
| Workshop | | Advancing Agentic Systems: Dynamic Task Decomposition, Tool Integration and Evaluation using Novel Metrics and Dataset | Shankar Kumar Jeyakumar · Alaa Ahmad · Adrian Gabriel |
| Affinity Event | | Towards unearthing neglected climate innovative solutions using an LLM-based search tool | César Quilodrán-Casas · Christopher Waite · Nicole Alhadeff · Diyona Dsouza · Cathal Hughes · Larissa Kunstel-Tabet · Alyssa Gilbert |
| Poster | Wed 11:00 | Make Your LLM Fully Utilize the Context | Shengnan An · Zexiong Ma · Zeqi Lin · Nanning Zheng · Jian-Guang Lou · Weizhu Chen |
| Workshop | | HoneyComb: A Flexible LLM-Based Agent System for Materials Science | Huan Zhang · Yu Song · Ziyu Hou · Santiago Miret · Bang Liu |
| Workshop | Sat 16:39 | HoneyComb: A Flexible LLM-Based Agent System for Materials Science | Huan Zhang · Yu Song · Ziyu Hou · Santiago Miret · Bang Liu |
| Workshop | | THaMES: An End-to-End Tool for Hallucination Mitigation and Evaluation in Large Language Models | Mengfei Liang · Archish Arun · Zekun Wu · CRISTIAN VILLALOBOS · Jonathan Lutch · Emre Kazim · Adriano Koshiyama · Philip Treleaven |