Workshop
|
|
Self-supervised Multimodal Model for Astronomy
Mariia Rizhko · Joshua Bloom
|
|
Workshop
|
|
Can Vision-Language Models Replace Human Annotators: A Case Study with CelebA Dataset
Haoming Lu · Feifei Zhong
|
|
Workshop
|
|
3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination
Jianing Yang · Xuweiyi Chen · Nikhil Madaan · Madhavan Iyengar · Shengyi Qian · David Fouhey · Joyce Chai
|
|
Workshop
|
|
Integrating Visual and Linguistic Instructions for Context-Aware Navigation Agents
Suhwan Choi · Yongjun Cho · Minchan Kim · Jaeyoon Jung · Myunchul Joe · Park Yu Been · Minseo Kim · Sungwoong Kim · Sungjae Lee · Whiseong Park · Jiwan Chung · Youngjae Yu
|
|
Workshop
|
|
RespLLM: Unifying Audio and Text with Multimodal LLMs for Generalized Respiratory Health Prediction
Yuwei Zhang · Tong Xia · Aaqib Saeed · Cecilia Mascolo
|
|
Workshop
|
|
TurtleBench: A Visual Programming Benchmark in Turtle Geometry
Sina Rismanchian · Yasaman Razeghi · Sameer Singh · Shayan Doroudi
|
|
Workshop
|
Sun 8:25
|
Invited Talk by Tomas Pfister - Multimodal time series modeling
|
|
Workshop
|
Sun 10:35
|
Invited Talk by Christoph Bergmeir - Fundamental limitations of foundational forecasting models: The need for multimodality and rigorous evaluation
|
|
Workshop
|
Sun 14:15
|
Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding
Shenghuan Sun · Alexander Schubert · Greg Goldgof · Zhiqing Sun · Tom Hartvigsen · Atul Butte · Ahmed Alaa
|
|
Workshop
|
Sat 10:30
|
Efficient Generative Multimodal Integration (EGMI): Enabling Audio Generation from Text-Image Pairs through Alignment with Large Language Models
Taemin Kim · Wooyeol Baek · Heeseok Oh
|
|
Workshop
|
|
RAP: Retrieval-Augmented Planning with Contextual Memory for Multimodal LLM Agents
Tomoyuki Kagaya · Thong Yuan · Yuxuan Lou · Panasonic Karlekar Jayashree · Panasonic Sugiri Pranata · Akira Kinose · Koki Oguri · Felix Wick · Yang You
|
|
Workshop
|
Sat 10:30
|
What do MLLMs hear? Examining the interaction between LLM and audio encoder components in Multimodal Large Language Models
Enis Çoban · Michael Mandel · Johanna Devaney
|
|