Poster | Fri 11:00 | DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving
Yuxuan Tong · Xiwen Zhang · Rui Wang · Ruidong Wu · Junxian He

Poster | Thu 16:30 | ConceptMix: A Compositional Image Generation Benchmark with Controllable Difficulty
Xindi Wu · Dingli Yu · Yangsibo Huang · Olga Russakovsky · Sanjeev Arora

Poster | Thu 16:30 | Easy2Hard-Bench: Standardized Difficulty Labels for Profiling LLM Performance and Generalization
Mucong Ding · Chenghao Deng · Jocelyn Choo · Zichu Wu · Aakriti Agrawal · Avi Schwarzschild · Tianyi Zhou · Tom Goldstein · John Langford · Animashree Anandkumar · Furong Huang

Workshop | GenAI Evaluation Maturity Framework (GEMF) to assess and improve GenAI Evaluations
Yilin Zhang · Frank J. Kanayet

Workshop | Which LLMs are Difficult to Detect? A Detailed Analysis of Potential Factors Contributing to Difficulties in LLM Text Detection
Shantanu Thorat · Tianbao Yang

Workshop | ConceptMix: A Compositional Image Generation Benchmark with Controllable Difficulty
Xindi Wu · Dingli Yu · Yangsibo Huang · Olga Russakovsky · Sanjeev Arora