29 Results
Poster
|
Wed 16:30 |
Evaluating Numerical Reasoning in Text-to-Image Models Ivana Kajić · Olivia Wiles · Isabela Albuquerque · Matthias Bauer · Su Wang · Jordi Pont-Tuset · Aida Nematzadeh |
|
Workshop
|
GSR-Bench: A Benchmark for Grounded Spatial Reasoning Evaluation via Multimodal LLMs Navid Rajabi · Jana Kosecka |
||
Workshop
|
FEABench: Evaluating Language Models on Real World Physics Reasoning Ability Nayantara Mudur · Hao Cui · Subhashini Venugopalan · Paul Raccuglia · Michael Brenner · Peter Norgaard |
||
Poster
|
Wed 11:00 |
MARVEL: Multidimensional Abstraction and Reasoning through Visual Evaluation and Learning Yifan Jiang · Jiarui Zhang · Kexuan Sun · Zhivar Sourati · Kian Ahrabian · Kaixin Ma · Filip Ilievski · Jay Pujara |
|
Poster
|
StrategyLLM: Large Language Models as Strategy Generators, Executors, Optimizers, and Evaluators for Problem Solving Chang Gao · Haiyun Jiang · Deng Cai · Shuming Shi · Wai Lam |
||
Workshop
|
STEM-PoM: Evaluating Language Models Math-Symbol Reasoning in Document Parsing Jiaru Zou · Qing Wang · Pratyush Thakur · Nickvash Kani |
||
Workshop
|
Sat 15:45 |
ReFeR: A Hierarchical Framework of Models as Evaluative and Reasoning Agents Yaswanth Narsupalli · Abhranil Chandra · Sreevatsa Muppirala · Manish Gupta · Pawan Goyal |
|
Workshop
|
FEABench: Evaluating Language Models on Real World Physics Reasoning Ability Nayantara Mudur · Hao Cui · Subhashini Venugopalan · Paul Raccuglia · Michael Brenner · Peter Norgaard |
||
Workshop
|
ReFeR: A Hierarchical Framework of Models as Evaluative and Reasoning Agents Yaswanth Narsupalli · Abhranil Chandra · Sreevatsa Muppirala · Manish Gupta · Pawan Goyal |
||
Workshop
|
Investigating Goal-Aligned and Empathetic Social Reasoning Strategies for Human-Like Social Intelligence in LLMs Anirudh Gajula · Raaghav Malik |
||
Workshop
|
Evaluating Interventional Reasoning Capabilities of Large Language Models Tejas Kasetty · Divyat Mahajan · Gintare Karolina Dziugaite · Alexandre Drouin · Dhanya Sridhar |
||
Workshop
|
Sat 15:45 |
Towards LLM-guided Efficient and Interpretable Multi-linear Tensor Network Rank Selection Giorgos Iacovides · Wuyang Zhou · Danilo Mandic |