NeurIPS 2024

6 Results

Poster	Fri 11:00	DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving	Yuxuan Tong · Xiwen Zhang · Rui Wang · Ruidong Wu · Junxian He
Poster	Thu 16:30	ConceptMix: A Compositional Image Generation Benchmark with Controllable Difficulty	Xindi Wu · Dingli Yu · Yangsibo Huang · Olga Russakovsky · Sanjeev Arora
Poster	Thu 16:30	Easy2Hard-Bench: Standardized Difficulty Labels for Profiling LLM Performance and Generalization	Mucong Ding · Chenghao Deng · Jocelyn Choo · Zichu Wu · Aakriti Agrawal · Avi Schwarzschild · Tianyi Zhou · Tom Goldstein · John Langford · Animashree Anandkumar · Furong Huang
Workshop		GenAI Evaluation Maturity Framework (GEMF) to assess and improve GenAI Evaluations	Yilin Zhang · Frank J. Kanayet
Workshop		Which LLMs are Difficult to Detect? A Detailed Analysis of Potential Factors Contributing to Difficulties in LLM Text Detection	Shantanu Thorat · Tianbao Yang
Workshop		ConceptMix: A Compositional Image Generation Benchmark with Controllable Difficulty	Xindi Wu · Dingli Yu · Yangsibo Huang · Olga Russakovsky · Sanjeev Arora