

Search All 2024 Events
 

Workshop
Self-supervised Multimodal Model for Astronomy
Mariia Rizhko · Joshua Bloom
Workshop
Can Vision-Language Models Replace Human Annotators: A Case Study with CelebA Dataset
Haoming Lu · Feifei Zhong
Workshop
3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination
Jianing Yang · Xuweiyi Chen · Nikhil Madaan · Madhavan Iyengar · Shengyi Qian · David Fouhey · Joyce Chai
Workshop
Integrating Visual and Linguistic Instructions for Context-Aware Navigation Agents
Suhwan Choi · Yongjun Cho · Minchan Kim · Jaeyoon Jung · Myunchul Joe · Park Yu Been · Minseo Kim · Sungwoong Kim · Sungjae Lee · WHISEONG PARK · Jiwan Chung · Youngjae Yu
Workshop
RespLLM: Unifying Audio and Text with Multimodal LLMs for Generalized Respiratory Health Prediction
Yuwei Zhang · Tong Xia · Aaqib Saeed · Cecilia Mascolo
Workshop
TurtleBench: A Visual Programming Benchmark in Turtle Geometry
Sina Rismanchian · Yasaman Razeghi · Sameer Singh · Shayan Doroudi
Workshop
Sun 8:25 Invited Talk by Tomas Pfister - Multimodal time series modeling
Workshop
Sun 10:35 Invited Talk by Christoph Bergmeir - Fundamental limitations of foundational forecasting models: The need for multimodality and rigorous evaluation
Workshop
Sun 14:15 Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding
Shenghuan Sun · Alexander Schubert · Greg Goldgof · Zhiqing Sun · Tom Hartvigsen · Atul Butte · Ahmed Alaa
Workshop
Sat 10:30 Efficient Generative Multimodal Integration (EGMI): Enabling Audio Generation from Text-Image Pairs through Alignment with Large Language Models
Taemin Kim · Wooyeol Baek · Heeseok Oh
Workshop
RAP: Retrieval-Augmented Planning with Contextual Memory for Multimodal LLM Agents
Tomoyuki Kagaya · Thong Yuan · Yuxuan Lou · Panasonic Karlekar Jayashree · Panasonic Sugiri Pranata · Akira Kinose · Koki Oguri · Felix Wick · Yang You
Workshop
Sat 10:30 What do MLLMs hear? Examining the interaction between LLM and audio encoder components in Multimodal Large Language Models
Enis Çoban · Michael Mandel · Johanna Devaney