Can we develop visually grounded dialog agents that can efficiently adapt to new tasks without forgetting how to talk to people? Such agents could leverage a larger variety of existing data to generalize to a new task, minimizing expensive data collection and annotation. In this work, we study a setting we call "Dialog without Dialog", which requires agents to develop visually grounded dialog models that can adapt to new tasks without language-level supervision. By factorizing intention and language, our model minimizes linguistic drift after fine-tuning for new tasks. We present qualitative results, automated metrics, and human studies that all show our model can adapt to new tasks and maintain language quality. Baselines either fail to perform well at new tasks or experience language drift, becoming unintelligible to humans. Code is available at: https://github.com/mcogswell/dialogwithoutdialog.
Author Information
Michael Cogswell (SRI International)
Jiasen Lu (Allen Institute for Artificial Intelligence)
Rishabh Jain (Georgia Tech)
Stefan Lee (Oregon State University)
Devi Parikh (Georgia Tech / Facebook AI Research (FAIR))
Dhruv Batra (Georgia Tech / Facebook AI Research (FAIR))
More from the Same Authors
- 2020 Poster: Language-Conditioned Imitation Learning for Robot Manipulation Tasks
  Simon Stepputtis · Joseph Campbell · Mariano Phielipp · Stefan Lee · Chitta Baral · Heni Ben Amor
- 2020 Spotlight: Language-Conditioned Imitation Learning for Robot Manipulation Tasks
  Simon Stepputtis · Joseph Campbell · Mariano Phielipp · Stefan Lee · Chitta Baral · Heni Ben Amor
- 2019 Poster: Cross-channel Communication Networks
  Jianwei Yang · Zhile Ren · Chuang Gan · Hongyuan Zhu · Devi Parikh
- 2019 Poster: ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
  Jiasen Lu · Dhruv Batra · Devi Parikh · Stefan Lee
- 2019 Poster: RUBi: Reducing Unimodal Biases for Visual Question Answering
  Remi Cadene · Corentin Dancette · Hedi Ben younes · Matthieu Cord · Devi Parikh
- 2019 Poster: Chasing Ghosts: Instruction Following as Bayesian State Tracking
  Peter Anderson · Ayush Shrivastava · Devi Parikh · Dhruv Batra · Stefan Lee
- 2018 Workshop: Visually grounded interaction and language
  Florian Strub · Harm de Vries · Erik Wijmans · Samyak Datta · Ethan Perez · Mateusz Malinowski · Stefan Lee · Peter Anderson · Aaron Courville · Jeremie Mary · Dhruv Batra · Devi Parikh · Olivier Pietquin · Chiori Hori · Tim Marks · Anoop Cherian
- 2017 Workshop: Visually grounded interaction and language
  Florian Strub · Harm de Vries · Abhishek Das · Satwik Kottur · Stefan Lee · Mateusz Malinowski · Olivier Pietquin · Devi Parikh · Dhruv Batra · Aaron Courville · Jeremie Mary
- 2017 Poster: Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model
  Jiasen Lu · Anitha Kannan · Jianwei Yang · Devi Parikh · Dhruv Batra
- 2016 Poster: Hierarchical Question-Image Co-Attention for Visual Question Answering
  Jiasen Lu · Jianwei Yang · Dhruv Batra · Devi Parikh
- 2016 Poster: Stochastic Multiple Choice Learning for Training Diverse Deep Ensembles
  Stefan Lee · Senthil Purushwalkam · Michael Cogswell · Viresh Ranjan · David Crandall · Dhruv Batra
- 2015 Poster: SubmodBoxes: Near-Optimal Search for a Set of Diverse Object Proposals
  Qing Sun · Dhruv Batra
- 2014 Workshop: Discrete Optimization in Machine Learning
  Jeffrey A Bilmes · Andreas Krause · Stefanie Jegelka · S Thomas McCormick · Sebastian Nowozin · Yaron Singer · Dhruv Batra · Volkan Cevher
- 2014 Poster: Submodular meets Structured: Finding Diverse Subsets in Exponentially-Large Structured Item Sets
  Adarsh Prasad · Stefanie Jegelka · Dhruv Batra
- 2014 Spotlight: Submodular meets Structured: Finding Diverse Subsets in Exponentially-Large Structured Item Sets
  Adarsh Prasad · Stefanie Jegelka · Dhruv Batra
- 2012 Poster: Multiple Choice Learning: Learning to Produce Multiple Structured Outputs
  Abner Guzmán-Rivera · Dhruv Batra · Pushmeet Kohli
- 2011 Workshop: Beyond Mahalanobis: Supervised Large-Scale Learning of Similarity
  Greg Shakhnarovich · Dhruv Batra · Brian Kulis · Kilian Q Weinberger
- 2011 Poster: Understanding the Intrinsic Memorability of Images
  Phillip Isola · Devi Parikh · Antonio Torralba · Aude Oliva