NeurIPS 2024

Poster

Wed 11:00

BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack
Yury Kuratov · Aydar Bulatov · Petr Anokhin · Ivan Rodkin · Dmitry Sorokin · Artyom Sorokin · Mikhail Burtsev

Workshop

WILT: A Multi-turn, Memorization-Robust Inductive Logic Benchmark for LLMs
Eryk Banatt · Jonathan Cheng · Tiffany Hwu

Workshop

MathCAMPS: Fine-grained Synthesis of Mathematical Problems From Human Curricula
Shubhra Mishra · Gabriel Poesia · Belinda Mo · Noah Goodman

Workshop

STEM-PoM: Evaluating Language Models Math-Symbol Reasoning in Document Parsing
Jiaru Zou · Qing Wang · Pratyush Thakur · Nickvash Kani

Workshop

Putnam-AXIOM: A Functional and Static Benchmark for Measuring Higher Level Mathematical Reasoning
Aryan Gulati · Brando Miranda · Eric Chen · Emily Xia · Kai Fronsdal · Bruno de Moraes Dumont · Sanmi Koyejo

Workshop

Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning
Yihe Deng · Paul Mineiro

Workshop

InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning
Xiaotian Han · Yiren Jian · Xuefeng Hu · Haogeng Liu · Yiqi Wang · Qihang Fan · Yuang Ai · Huaibo Huang · Ran He · Zhenheng Yang · Quanzeng You

Workshop

OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data
Shubham Toshniwal · Wei Du · Ivan Moshkov · Branislav Kisacanin · Alexan Ayrapetyan · Igor Gitman

Workshop

Learning Mathematical Rules with Large Language Models
Antoine Gorceix · Bastien Le Chenadec · Ahmad Rammal · Nelson Vadori · Manuela Veloso

Workshop

VinePPO: Accurate Credit Assignment in RL for LLM Mathematical Reasoning
Amirhossein Kazemnejad · Milad Aghajohari · Eva Portelance · Alessandro Sordoni · Siva Reddy · Aaron Courville · Nicolas Le Roux

Workshop

DrawEduMath: Evaluating Vision Language Models with Expert-Annotated Students’ Hand-Drawn Math Images
Sami Baral · Li Lucy · Ryan Knight · Alice Ng · Luca Soldaini · Neil Heffernan · Kyle Lo

Workshop

AI-Assisted Generation of Difficult Math Questions
Vedant Shah · Dingli Yu · Kaifeng Lyu · Simon Park · Jiatong Yu · Yinghui He · Nan Rosemary Ke · Michael Mozer · Yoshua Bengio · Sanjeev Arora · Anirudh Goyal

