36 Results
Affinity Event
|
Reasoning-Driven Jury System for LLM Evaluation Ayda Sultan |
||
Workshop
|
LLM Hallucination Reasoning with Zero-shot Knowledge Test Seongmin Lee · Hsiang Hsu · Richard Chen |
||
Workshop
|
What do Learning Dynamics Reveal about Generalization in LLM Reasoning? Yijun Kang · Amrith Setlur · Dibya Ghosh · Jacob Steinhardt · Claire Tomlin · Sergey Levine · Aviral Kumar |
||
Poster
|
Fri 16:30 |
Enhancing LLM Reasoning via Vision-Augmented Prompting Ziyang Xiao · Dongxiang Zhang · Xiongwei Han · Xiaojin Fu · Wing Yin YU · Tao Zhong · Sai Wu · Yuan Wang · Jianwei Yin · Gang Chen |
|
Poster
|
Thu 11:00 |
Detecting Bugs with Substantial Monetary Consequences by LLM and Rule-based Reasoning Brian Zhang · Zhuo Zhang |
|
Poster
|
Thu 11:00 |
AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning Shirley Wu · Shiyu Zhao · Qian Huang · Kexin Huang · Michihiro Yasunaga · Kaidi Cao · Vassilis Ioannidis · Karthik Subbian · Jure Leskovec · James Zou |
|
Workshop
|
Not All LLM Reasoners Are Created Equal Arian Hosseini · Alessandro Sordoni · Daniel Toyama · Aaron Courville · Rishabh Agarwal |
||
Poster
|
Wed 16:30 |
Self-playing Adversarial Language Game Enhances LLM Reasoning Pengyu Cheng · Tianhao Hu · Han Xu · Zhisong Zhang · Yong Dai · Lei Han · Nan Du · Xiaolong Li |
|
Workshop
|
Agentic Anomaly Detection for Shipping Alexander Timms · Abigail Langbridge · Fearghal O'Donncha |
||
Workshop
|
Not All LLM Reasoners Are Created Equal Arian Hosseini · Alessandro Sordoni · Daniel Toyama · Aaron Courville · Rishabh Agarwal |
||
Poster
|
Fri 16:30 |
RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold Amrith Setlur · Saurabh Garg · Xinyang Geng · Naman Garg · Virginia Smith · Aviral Kumar |
|
Workshop
|
MMLU-Pro+: Evaluating Higher-Order Reasoning and Shortcut Learning in LLMs Saeid Asgari · Aliasghar Khani · Amir Khasahmadi |