We address the practical yet challenging problem of training robot agents to navigate an environment by following a path described in language instructions. The instructions often contain descriptions of objects in the environment. To achieve accurate and efficient navigation, it is critical to build a map that accurately represents both the spatial locations and the semantic information of the objects in the environment. However, enabling a robot to build a map that represents the environment well is extremely challenging, as the environment often involves diverse objects with various attributes. In this paper, we propose a multi-granularity map, which contains both fine-grained object details (e.g., color, texture) and semantic classes, to represent objects more comprehensively. Moreover, we propose a weakly-supervised auxiliary task, which requires the agent to localize instruction-relevant objects on the map. Through this task, the agent not only learns to localize the instruction-relevant objects for navigation but is also encouraged to learn a better map representation that reveals object information. We then feed the learned map and the instruction to a waypoint predictor to determine the next navigation goal. Experimental results show that our method outperforms the state of the art by 4.0% and 4.6% in success rate in seen and unseen environments, respectively, on the VLN-CE dataset. The code is available at https://github.com/PeihaoChen/WS-MGMap.
Author Information
Peihao Chen (South China University of Technology)
Dongyu Ji (South China University of Technology)
Kunyang Lin (South China University of Technology)
Runhao Zeng (South China University of Technology)
Thomas Li (AIIT, Peking University)
Mingkui Tan (South China University of Technology)
Chuang Gan (UMass Amherst / MIT-IBM Watson AI Lab)
More from the Same Authors
-
2021 : ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation »
Chuang Gan · Jeremy Schwartz · Seth Alter · Damian Mrowca · Martin Schrimpf · James Traer · Julian De Freitas · Jonas Kubilius · Abhishek Bhandwaldar · Nick Haber · Megumi Sano · Kuno Kim · Elias Wang · Michael Lingelbach · Aidan Curtis · Kevin Feigelis · Daniel Bear · Dan Gutfreund · David Cox · Antonio Torralba · James J DiCarlo · Josh Tenenbaum · Josh McDermott · Dan Yamins -
2021 : STAR: A Benchmark for Situated Reasoning in Real-World Videos »
Bo Wu · Shoubin Yu · Zhenfang Chen · Josh Tenenbaum · Chuang Gan -
2022 Poster: Learning Physical Dynamics with Subequivariant Graph Neural Networks »
Jiaqi Han · Wenbing Huang · Hengbo Ma · Jiachen Li · Josh Tenenbaum · Chuang Gan -
2022 Poster: SNAKE: Shape-aware Neural 3D Keypoint Field »
Chengliang Zhong · Peixing You · Xiaoxue Chen · Hao Zhao · Fuchun Sun · Guyue Zhou · Xiaodong Mu · Chuang Gan · Wenbing Huang -
2022 : Planning with Large Language Models for Code Generation »
Shun Zhang · Zhenfang Chen · Yikang Shen · Mingyu Ding · Josh Tenenbaum · Chuang Gan -
2022 : Hyper-Decision Transformer for Efficient Online Policy Adaptation »
Mengdi Xu · Yuchen Lu · Yikang Shen · Shun Zhang · DING ZHAO · Chuang Gan -
2022 : VARIATIONAL REPARAMETRIZED POLICY LEARNING WITH DIFFERENTIABLE PHYSICS »
Zhiao Huang · Litian Liang · Zhan Ling · Xuanlin Li · Chuang Gan · Hao Su -
2022 Spotlight: Lightning Talks 6A-3 »
Junyu Xie · Chengliang Zhong · Ali Ayub · Sravanti Addepalli · Harsh Rangwani · Jiapeng Tang · Yuchen Rao · Zhiying Jiang · Yuqi Wang · Xingzhe He · Gene Chou · Ilya Chugunov · Samyak Jain · Yuntao Chen · Weidi Xie · Sumukh K Aithal · Carter Fendley · Lev Markhasin · Yiqin Dai · Peixing You · Bastian Wandt · Yinyu Nie · Helge Rhodin · Felix Heide · Ji Xin · Angela Dai · Andrew Zisserman · Bi Wang · Xiaoxue Chen · Mayank Mishra · ZHAO-XIANG ZHANG · Venkatesh Babu R · Justus Thies · Ming Li · Hao Zhao · Venkatesh Babu R · Jimmy Lin · Fuchun Sun · Matthias Niessner · Guyue Zhou · Xiaodong Mu · Chuang Gan · Wenbing Huang -
2022 Spotlight: SNAKE: Shape-aware Neural 3D Keypoint Field »
Chengliang Zhong · Peixing You · Xiaoxue Chen · Hao Zhao · Fuchun Sun · Guyue Zhou · Xiaodong Mu · Chuang Gan · Wenbing Huang -
2022 Spotlight: Lightning Talks 5A-3 »
Minting Pan · Xiang Chen · Wenhan Huang · Can Chang · Zhecheng Yuan · Jianzhun Shao · Yushi Cao · Peihao Chen · Ke Xue · Zhengrong Xue · Zhiqiang Lou · Xiangming Zhu · Lei Li · Zhiming Li · Kai Li · Jiacheng Xu · Dongyu Ji · Ni Mu · Kun Shao · Tianpei Yang · Kunyang Lin · Ningyu Zhang · Yunbo Wang · Lei Yuan · Bo Yuan · Hongchang Zhang · Jiajun Wu · Tianze Zhou · Xueqian Wang · Ling Pan · Yuhang Jiang · Xiaokang Yang · Xiaozhuan Liang · Hao Zhang · Weiwen Hu · Miqing Li · YAN ZHENG · Matthew Taylor · Huazhe Xu · Shumin Deng · Chao Qian · YI WU · Shuncheng He · Wenbing Huang · Chuanqi Tan · Zongzhang Zhang · Yang Gao · Jun Luo · Yi Li · Xiangyang Ji · Thomas Li · Mingkui Tan · Fei Huang · Yang Yu · Huazhe Xu · Dongge Wang · Jianye Hao · Chuang Gan · Yang Liu · Luo Si · Hangyu Mao · Huajun Chen · Jianye Hao · Jun Wang · Xiaotie Deng -
2022 Spotlight: Learning Active Camera for Multi-Object Navigation »
Peihao Chen · Dongyu Ji · Kunyang Lin · Weiwen Hu · Wenbing Huang · Thomas Li · Mingkui Tan · Chuang Gan -
2022 Spotlight: Lightning Talks 4B-3 »
Zicheng Zhang · Mancheng Meng · Antoine Guedon · Yue Wu · Wei Mao · Zaiyu Huang · Peihao Chen · Shizhe Chen · yongwei chen · Keqiang Sun · Yi Zhu · chen rui · Hanhui Li · Dongyu Ji · Ziyan Wu · miaomiao Liu · Pascal Monasse · Yu Deng · Shangzhe Wu · Pierre-Louis Guhur · Jiaolong Yang · Kunyang Lin · Makarand Tapaswi · Zhaoyang Huang · Terrence Chen · Jiabao Lei · Jianzhuang Liu · Vincent Lepetit · Zhenyu Xie · Richard I Hartley · Dinggang Shen · Xiaodan Liang · Runhao Zeng · Cordelia Schmid · Michael Kampffmeyer · Mathieu Salzmann · Ning Zhang · Fangyun Wei · Yabin Zhang · Fan Yang · Qifeng Chen · Wei Ke · Quan Wang · Thomas Li · qingling Cai · Kui Jia · Ivan Laptev · Mingkui Tan · Xin Tong · Hongsheng Li · Xiaodan Liang · Chuang Gan -
2022 Spotlight: Learning Physical Dynamics with Subequivariant Graph Neural Networks »
Jiaqi Han · Wenbing Huang · Hengbo Ma · Jiachen Li · Josh Tenenbaum · Chuang Gan -
2022 Spotlight: Weakly-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigation »
Peihao Chen · Dongyu Ji · Kunyang Lin · Runhao Zeng · Thomas Li · Mingkui Tan · Chuang Gan -
2022 Spotlight: Lightning Talks 4B-1 »
Alexandra Senderovich · Zhijie Deng · Navid Ansari · Xuefei Ning · Yasmin Salehi · Xiang Huang · Chenyang Wu · Kelsey Allen · Jiaqi Han · Nikita Balagansky · Tatiana Lopez-Guevara · Tianci Li · Zhanhong Ye · Zixuan Zhou · Feng Zhou · Ekaterina Bulatova · Daniil Gavrilov · Wenbing Huang · Dennis Giannacopoulos · Hans-peter Seidel · Anton Obukhov · Kimberly Stachenfeld · Hongsheng Liu · Jun Zhu · Junbo Zhao · Hengbo Ma · Nima Vahidi Ferdowsi · Zongzhang Zhang · Vahid Babaei · Jiachen Li · Alvaro Sanchez Gonzalez · Yang Yu · Shi Ji · Maxim Rakhuba · Tianchen Zhao · Yiping Deng · Peter Battaglia · Josh Tenenbaum · Zidong Wang · Chuang Gan · Changcheng Tang · Jessica Hamrick · Kang Yang · Tobias Pfaff · Yang Li · Shuang Liang · Min Wang · Huazhong Yang · Haotian CHU · Yu Wang · Fan Yu · Bei Hua · Lei Chen · Bin Dong -
2022 Poster: 3D Concept Grounding on Neural Fields »
Yining Hong · Yilun Du · Chunru Lin · Josh Tenenbaum · Chuang Gan -
2022 Poster: Learning Active Camera for Multi-Object Navigation »
Peihao Chen · Dongyu Ji · Kunyang Lin · Weiwen Hu · Wenbing Huang · Thomas Li · Mingkui Tan · Chuang Gan -
2022 Poster: Learning Neural Acoustic Fields »
Andrew Luo · Yilun Du · Michael Tarr · Josh Tenenbaum · Antonio Torralba · Chuang Gan -
2022 Poster: On-Device Training Under 256KB Memory »
Ji Lin · Ligeng Zhu · Wei-Ming Chen · Wei-Chen Wang · Chuang Gan · Song Han -
2021 Poster: Memory-efficient Patch-based Inference for Tiny Deep Learning »
Ji Lin · Wei-Ming Chen · Han Cai · Chuang Gan · Song Han -
2021 Poster: Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language »
Mingyu Ding · Zhenfang Chen · Tao Du · Ping Luo · Josh Tenenbaum · Chuang Gan -
2021 Poster: PTR: A Benchmark for Part-based Conceptual, Relational, and Physical Reasoning »
Yining Hong · Li Yi · Josh Tenenbaum · Antonio Torralba · Chuang Gan -
2021 Poster: Debiased Visual Question Answering from Feature and Sample Perspectives »
Zhiquan Wen · Guanghui Xu · Mingkui Tan · Qingyao Wu · Qi Wu -
2021 Poster: When does Contrastive Learning Preserve Adversarial Robustness from Pretraining to Finetuning? »
Lijie Fan · Sijia Liu · Pin-Yu Chen · Gaoyuan Zhang · Chuang Gan -
2020 Poster: MCUNet: Tiny Deep Learning on IoT Devices »
Ji Lin · Wei-Ming Chen · Yujun Lin · john cohn · Chuang Gan · Song Han -
2020 Spotlight: MCUNet: Tiny Deep Learning on IoT Devices »
Ji Lin · Wei-Ming Chen · Yujun Lin · john cohn · Chuang Gan · Song Han -
2020 Poster: TinyTL: Reduce Memory, Not Parameters for Efficient On-Device Learning »
Han Cai · Chuang Gan · Ligeng Zhu · Song Han -
2020 : Neurosymbolic Visual Reasoning »
Chuang Gan -
2019 Poster: Cross-channel Communication Networks »
Jianwei Yang · Zhile Ren · Chuang Gan · Hongyuan Zhu · Devi Parikh -
2019 Poster: Visual Concept-Metaconcept Learning »
Chi Han · Jiayuan Mao · Chuang Gan · Josh Tenenbaum · Jiajun Wu -
2019 Poster: NAT: Neural Architecture Transformer for Accurate and Compact Architectures »
Yong Guo · Yin Zheng · Mingkui Tan · Qi Chen · Jian Chen · Peilin Zhao · Junzhou Huang -
2019 Poster: Imitation Learning from Observations by Minimizing Inverse Dynamics Disagreement »
Chao Yang · Xiaojian Ma · Wenbing Huang · Fuchun Sun · Huaping Liu · Junzhou Huang · Chuang Gan -
2019 Spotlight: Imitation Learning from Observations by Minimizing Inverse Dynamics Disagreement »
Chao Yang · Xiaojian Ma · Wenbing Huang · Fuchun Sun · Huaping Liu · Junzhou Huang · Chuang Gan -
2019 Poster: Multi-marginal Wasserstein GAN »
Jiezhang Cao · Langyuan Mo · Yifan Zhang · Kui Jia · Chunhua Shen · Mingkui Tan -
2018 Poster: Discrimination-aware Channel Pruning for Deep Neural Networks »
Zhuangwei Zhuang · Mingkui Tan · Bohan Zhuang · Jing Liu · Yong Guo · Qingyao Wu · Junzhou Huang · Jinhui Zhu -
2018 Poster: Weakly Supervised Dense Event Captioning in Videos »
Xin Wang · Wenbing Huang · Chuang Gan · Jingdong Wang · Wenwu Zhu · Junzhou Huang -
2018 Poster: Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding »
Kexin Yi · Jiajun Wu · Chuang Gan · Antonio Torralba · Pushmeet Kohli · Josh Tenenbaum -
2018 Spotlight: Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding »
Kexin Yi · Jiajun Wu · Chuang Gan · Antonio Torralba · Pushmeet Kohli · Josh Tenenbaum