Understanding multimodal perception for embodied AI (EAI) is an open question because such inputs may contain highly complementary as well as redundant information for the task. A relevant direction for multimodal policies is understanding the global trends of each modality at the fusion layer. To this end, we disentangle the attributions for visual, language, and previous action inputs across different policies trained on the ALFRED dataset. Attribution analysis can be used to rank and group failure scenarios, investigate modeling and dataset biases, and critically analyze multimodal EAI policies for robustness and user trust before deployment. We present MAEA, a framework to compute global attributions per modality of any differentiable policy. In addition, we show how attributions enable lower-level behavior analysis in EAI policies through two example case studies on language and visual attributions.
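For intuition, here is a minimal sketch of how per-modality attributions might be computed for a differentiable multimodal policy. The `policy` callable, its input names (`visual`, `language`, `prev_action`), and the gradient-times-input attribution with L1 aggregation are illustrative assumptions, not the paper's exact method; all inputs are assumed to be continuous tensors (e.g., already-embedded language).

```python
# Minimal sketch (illustrative, not MAEA's exact method): per-modality
# attribution for a differentiable policy via gradient x input, then
# aggregation into global per-modality shares over many steps.
import torch

def modality_attributions(policy, visual, language, prev_action, action_idx):
    """Attribute one predicted action logit to each input modality.

    Assumes all inputs are continuous (e.g., embedded) float tensors of
    shape (batch, ...) and that `policy` returns (batch, num_actions) logits.
    """
    inputs = {"visual": visual, "language": language, "prev_action": prev_action}
    for x in inputs.values():
        x.requires_grad_(True)
    logits = policy(**inputs)               # assumed shape: (batch, num_actions)
    logits[:, action_idx].sum().backward()  # scalar score for the chosen action
    # Collapse |grad * input| to one attribution scalar per modality, per example.
    return {name: (x.grad * x).abs().flatten(1).sum(dim=1)
            for name, x in inputs.items()}

def global_attribution(per_step):
    """Average each modality's normalized attribution share over a batch."""
    totals = torch.stack(list(per_step.values()))      # (num_modalities, batch)
    shares = totals / totals.sum(dim=0, keepdim=True)  # per-example shares
    return {name: share.mean().item() for name, share in zip(per_step, shares)}
```

Normalizing per step before averaging makes the shares comparable across policies, so one can ask, for example, whether a policy leans more on language or on vision at the fusion layer.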
Author Information
Vidhi Jain (Carnegie Mellon University)
Vidhi Jain is a Robotics Ph.D. student advised by Yonatan Bisk (CMU, LTI). She is interested in learning multimodal policies for embodied AI and robots that perform complex everyday tasks.
Jayant Sravan Tamarapalli (Carnegie Mellon University)
Sahiti Yerramilli (Carnegie Mellon University)
Yonatan Bisk (LTI @ CMU)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 : MAEA: Multimodal Attribution Framework for Embodied AI
  Fri. Dec 2nd 03:42 -- 03:48 PM
More from the Same Authors
- 2022 : Tackling AlfWorld with Action Attention and Common Sense from Language Models
  Yue Wu · So Yeon Min · Yonatan Bisk · Russ Salakhutdinov · Shrimai Prabhumoye
- 2022 : MAEA: Multimodal Attribution Framework for Embodied AI
  Vidhi Jain · Jayant Sravan Tamarapalli · Sahiti Yerramilli · Yonatan Bisk
- 2023 Poster: SPRING: Studying Papers and Reasoning to play Games
  Yue Wu · So Yeon Min · Shrimai Prabhumoye · Yonatan Bisk · Russ Salakhutdinov · Amos Azaria · Tom Mitchell · Yuanzhi Li
- 2023 Poster: SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs
  Lijun Yu · Yong Cheng · Zhiruo Wang · Vivek Kumar · Wolfgang Macherey · Yanping Huang · David Ross · Irfan Essa · Yonatan Bisk · Ming-Hsuan Yang · Kevin Murphy · Alexander Hauptmann · Lu Jiang
- 2023 Competition: The HomeRobot Open Vocabulary Mobile Manipulation Challenge
  Sriram Yenamandra · Arun Ramachandran · Mukul Khanna · Karmesh Yadav · Devendra Singh Chaplot · Gunjan Chhablani · Alexander Clegg · Theophile Gervet · Vidhi Jain · Ruslan Partsey · Ram Ramrakhya · Andrew Szot · Austin Wang · Tsung-Yen Yang · Aaron Edsinger · Charles Kemp · Binit Shah · Zsolt Kira · Dhruv Batra · Roozbeh Mottaghi · Yonatan Bisk · Chris Paxton
- 2020 : Spotlight Talk: Jain
  Vidhi Jain
- 2020 Workshop: Workshop on Dataset Curation and Security
  Nathalie Baracaldo · Yonatan Bisk · Avrim Blum · Michael Curry · John Dickerson · Micah Goldblum · Tom Goldstein · Bo Li · Avi Schwarzschild
- 2019 Poster: Defending Against Neural Fake News
  Rowan Zellers · Ari Holtzman · Hannah Rashkin · Yonatan Bisk · Ali Farhadi · Franziska Roesner · Yejin Choi
- 2018 Workshop: Wordplay: Reinforcement and Language Learning in Text-based Games
  Adam Trischler · Angeliki Lazaridou · Yonatan Bisk · Wendy Tay · Nate Kushman · Marc-Alexandre Côté · Alessandro Sordoni · Daniel Ricks · Tom Zahavy · Hal Daumé III