Navigation guided by natural language instructions presents a challenging reasoning problem for instruction followers. Natural language instructions typically identify only a few high-level decisions and landmarks rather than complete low-level motor behaviors; much of the missing information must be inferred from perceptual context. In machine learning settings, this is doubly challenging: it is difficult to collect enough annotated data to learn this reasoning process from scratch, and it is also difficult to implement the reasoning process using generic sequence models. Here we describe an approach to vision-and-language navigation that addresses both of these issues with an embedded speaker model. We use this speaker model (1) to synthesize new instructions for data augmentation and (2) to implement pragmatic reasoning, which evaluates how well candidate action sequences explain an instruction. Both steps are supported by a panoramic action space that reflects the granularity of human-generated instructions. Experiments show that all three components of this approach (speaker-driven data augmentation, pragmatic reasoning, and the panoramic action space) dramatically improve the performance of a baseline instruction follower, more than doubling the success rate over the best existing approach on a standard benchmark.
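The pragmatic reasoning step described above can be illustrated with a minimal sketch. The abstract does not state the exact scoring rule, so the weighted sum of log-probabilities below is an assumption made for illustration. The names `pragmatic_rerank`, `follower_logp`, `speaker_logp`, and `speaker_weight` are hypothetical, and the toy probability tables stand in for trained follower and speaker models.

```python
import math
from typing import Callable, Sequence, Tuple

Route = Tuple[str, ...]  # hypothetical: a route as a tuple of action labels


def pragmatic_rerank(
    instruction: str,
    candidates: Sequence[Route],
    follower_logp: Callable[[str, Route], float],
    speaker_logp: Callable[[Route, str], float],
    speaker_weight: float = 0.5,
) -> Route:
    """Pick the candidate route that best balances two scores.

    follower_logp(instruction, route): how likely the follower is to take the route.
    speaker_logp(route, instruction): how well the route explains the instruction.
    The linear interpolation in log space is an assumed scoring rule.
    """
    if not candidates:
        raise ValueError("need at least one candidate route")

    def score(route: Route) -> float:
        return (speaker_weight * speaker_logp(route, instruction)
                + (1.0 - speaker_weight) * follower_logp(instruction, route))

    return max(candidates, key=score)


# Toy stand-ins for trained models (made-up numbers, for illustration only).
LEFT: Route = ("forward", "left")
RIGHT: Route = ("forward", "right")
FOLLOWER_TABLE = {LEFT: 0.6, RIGHT: 0.4}
SPEAKER_TABLE = {LEFT: 0.1, RIGHT: 0.7}


def toy_follower_logp(instruction: str, route: Route) -> float:
    return math.log(FOLLOWER_TABLE[route])


def toy_speaker_logp(route: Route, instruction: str) -> float:
    return math.log(SPEAKER_TABLE[route])


best = pragmatic_rerank(
    "walk past the sofa and turn right",
    [LEFT, RIGHT],
    toy_follower_logp,
    toy_speaker_logp,
)
print(best)
```

In this toy setup the follower alone slightly prefers the left route, but the speaker finds that the right route explains the instruction much better, so the combined score selects the right route. Setting `speaker_weight` to 0 recovers the follower's own choice.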
Author Information
Daniel Fried (UC Berkeley)
Ronghang Hu (University of California, Berkeley)
Volkan Cirik (Carnegie Mellon University)
Anna Rohrbach (UC Berkeley)
Jacob Andreas (UC Berkeley)
LP Morency (Carnegie Mellon University)
Taylor Berg-Kirkpatrick (Carnegie Mellon University)
Kate Saenko (Boston University)
Dan Klein (UC Berkeley)
Trevor Darrell (UC Berkeley)