Search All 2024 Events
 

81 Results

Page 5 of 7
Poster
Wed 16:30 RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation
Dongyu Ru · Lin Qiu · Xiangkun Hu · Tianhang Zhang · Peng Shi · Shuaichen Chang · Cheng Jiayang · Cunxiang Wang · Shichao Sun · Huanyu Li · Zizhao Zhang · Binjie Wang · Jiarong Jiang · Tong He · Zhiguo Wang · Pengfei Liu · Yue Zhang · Zheng Zhang
Workshop
Sat 12:00 CPP-UT-Bench: Can LLMs Write Complex Unit Tests in C++?
Vaishnavi Bhargava · Rajat Ghosh · Debojyoti Dutta
Workshop
Sat 15:45 MarkMyWords: Analyzing and Evaluating Language Model Watermarks
Julien Piet · Chawin Sitawarin · Vivian Fang · Norman Mu · David Wagner
Poster
Wed 16:30 MMLONGBENCH-DOC: Benchmarking Long-context Document Understanding with Visualizations
Yubo Ma · Yuhang Zang · Liangyu Chen · Meiqi Chen · Yizhu Jiao · Xinze Li · Xinyuan Lu · Ziyu Liu · Yan Ma · Xiaoyi Dong · Pan Zhang · Liangming Pan · Yu-Gang Jiang · Jiaqi Wang · Yixin Cao · Aixin Sun
Poster
Wed 11:00 Navigating the Maze of Explainable AI: A Systematic Approach to Evaluating Methods and Metrics
Lukas Klein · Carsten Lüth · Udo Schlegel · Till Bungert · Mennatallah El-Assady · Paul Jaeger
Poster
Thu 11:00 WONDERBREAD: A Benchmark for Evaluating Multimodal Foundation Models on Business Process Management Tasks
Michael Wornow · Avanika Narayan · Ben Viggiano · Ishan Khare · Tathagat Verma · Tibor Thompson · Miguel Hernandez · Sudharsan Sundar · Chloe Trujillo · Krrish Chawla · Rongfei Lu · Justin Shen · Divya Nagaraj · Joshua Martinez · Vardhan Agrawal · Althea Hudson · Nigam Shah · Christopher Ré
Poster
Fri 11:00 CoMix: A Comprehensive Benchmark for Multi-Task Comic Understanding
Emanuele Vivoli · Marco Bertini · Dimosthenis Karatzas
Poster
NN4SysBench: Characterizing Neural Network Verification for Computer Systems
Shuyi Lin · Haoyu He · Tianhao WEI · Kaidi Xu · Huan Zhang · Gagandeep Singh · Changliu Liu · Cheng Tan
Poster
Thu 16:30 ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation
Shenghai Yuan · Jinfa Huang · Yongqi Xu · YaoYang Liu · Shaofeng Zhang · Yujun Shi · Rui-Jie Zhu · Xinhua Cheng · Jiebo Luo · Li Yuan
Workshop
Sat 11:45 MM-SpuBench: Towards Better Understanding of Spurious Biases in Multimodal LLMs
Wenqian Ye · Guangtao Zheng · Yunsheng Ma · Xu Cao · Bolin Lai · James Rehg · Aidong Zhang
Poster
Thu 16:30 Benchmarking Complex Instruction-Following with Multiple Constraints Composition
Bosi Wen · Pei Ke · Xiaotao Gu · Lindong Wu · Hao Huang · Jinfeng Zhou · Wenchuang Li · Binxin Hu · Wendy Gao · Jiaxing Xu · Yiming Liu · Jie Tang · Hongning Wang · Minlie Huang
Poster
Fri 16:30 EgoSim: An Egocentric Multi-view Simulator and Real Dataset for Body-worn Cameras during Motion and Activity
Dominik Hollidt · Paul Streli · Jiaxi Jiang · Yasaman Haghighi · Changlin Qian · Xintong Liu · Christian Holz