81 Results
Poster
|
Wed 11:00 |
VERIFIED: A Video Corpus Moment Retrieval Benchmark for Fine-Grained Video Understanding Houlun Chen · Xin Wang · Hong Chen · Zeyang Zhang · Wei Feng · Bin Huang · Jia Jia · Wenwu Zhu |
|
Poster
|
Wed 16:30 |
AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents Ma Chang · Junlei Zhang · Zhihao Zhu · Cheng Yang · Yujiu Yang · Yaohui Jin · Zhenzhong Lan · Lingpeng Kong · Junxian He |
|
Poster
|
Thu 11:00 |
MediQ: Question-Asking LLMs and a Benchmark for Reliable Interactive Clinical Reasoning Stella Li · Vidhisha Balachandran · Shangbin Feng · Jonathan Ilgen · Emma Pierson · Pang Wei Koh · Yulia Tsvetkov |
|
Poster
|
Thu 11:00 |
The State of Data Curation at NeurIPS: An Assessment of Dataset Development Practices in the Datasets and Benchmarks Track Eshta Bhardwaj · Harshit Gujral · Siyi Wu · Ciara Zogheib · Tegan Maharaj · Christoph Becker |
|
Poster
|
Wed 11:00 |
A Large-Scale Human-Centric Benchmark for Referring Expression Comprehension in the LMM Era Fangyun Wei · Jinjing Zhao · Kun Yan · Hongyang Zhang · Chang Xu |
|
Poster
|
Fri 16:30 |
Rethinking the Evaluation of Out-of-Distribution Detection: A Sorites Paradox Xingming Long · Jie Zhang · Shiguang Shan · Xilin Chen |
|
Poster
|
Fri 16:30 |
NanoBaseLib: A Multi-Task Benchmark Dataset for Nanopore Sequencing Guangzhao Cheng · Chengbo Fu · Lu Cheng |
|
Poster
|
Thu 16:30 |
Towards Heterogeneous Long-tailed Learning: Benchmarking, Metrics, and Toolbox Haohui Wang · Weijie Guan · Chen Jianpeng · Zi Wang · Dawei Zhou |
|
Workshop
|
Putnam-AXIOM: A Functional and Static Benchmark for Measuring Higher Level Mathematical Reasoning Aryan Gulati · Brando Miranda · Eric Chen · Emily Xia · Kai Fronsdal · Bruno de Moraes Dumont · Sanmi Koyejo |
|
Poster
|
Thu 11:00 |
TAPVid-3D: A Benchmark for Tracking Any Point in 3D Skanda Koppula · Ignacio Rocco · Yi Yang · Joseph Heyward · Joao Carreira · Andrew Zisserman · Gabriel Brostow · Carl Doersch |
|
Poster
|
Fri 16:30 |
PEACE: A Dataset of Pharmaceutical Care for Cancer Pain Analgesia Evaluation and Medication Decision Yutao Dou · Huimin Yu · Wei Li · Jingyang Li · Fei Xia · Jian Xiao |
|
Poster
|
Wed 16:30 |
BEACON: Benchmark for Comprehensive RNA Tasks and Language Models Yuchen Ren · Zhiyuan Chen · Lifeng Qiao · Hongtai Jing · Yuchen Cai · Sheng Xu · Peng Ye · Xinzhu Ma · Siqi Sun · Hongliang Yan · Dong Yuan · Wanli Ouyang · Xihui Liu |