A growing number of researchers in computer vision have started to explore how language accompanying images and video can be used to aid interpretation and retrieval, as well as to train object and activity recognizers. Simultaneously, an increasing number of computational linguists have begun to investigate how visual information can be used to aid language learning and interpretation, and to ground the meaning of words and sentences in perception. However, there has been very little direct interaction between researchers in these two disciplines. Consequently, researchers in each area have only a limited understanding of the methods used in the other, and do not fully exploit the latest ideas and techniques from both disciplines when developing systems that integrate language and vision. We therefore believe the time is particularly opportune for a workshop that brings together researchers in both computer vision and natural-language processing (NLP) to discuss issues and ideas in developing systems that combine language and vision.
Traditional machine learning for both computer vision and NLP requires manually annotating images, video, text, or speech with detailed labels, parse trees, segmentations, and so on. Methods that integrate language and vision hold the promise of greatly reducing such manual supervision by using naturally co-occurring text and images or video to supervise each other.
There is also a wide range of important real-world applications that require integrating vision and language, including but not limited to: image and video retrieval, human-robot interaction, medical image processing, human-computer interaction in virtual worlds, and computer graphics generation.
More than any other major conference, NIPS attracts researchers from both computer vision and computational linguistics. We therefore believe it is the best venue for a workshop that brings these two communities together for the first time to interact, collaborate, and discuss issues and future directions in integrating language and vision.
Author Information
Raymond Mooney (University of Texas at Austin)
Trevor Darrell (UC Berkeley)
Kate Saenko (UMass Lowell)
More from the Same Authors
- 2022 : Zero-shot Video Moment Retrieval With Off-the-Shelf Models
  Anuj Diwan · Puyuan Peng · Raymond Mooney
- 2022 : Using Both Demonstrations and Language Instructions to Efficiently Learn Robotic Tasks
  Albert Yu · Raymond Mooney
- 2022 : Language-guided Task Adaptation for Imitation Learning
  Prasoon Goyal · Raymond Mooney · Scott Niekum
- 2022 : Studying Bias in GANs through the Lens of Race
  Vongani Maluleke · Neerja Thakkar · Tim Brooks · Ethan Weber · Trevor Darrell · Alexei Efros · Angjoo Kanazawa · Devin Guillory
- 2020 Poster: Auxiliary Task Reweighting for Minimum-data Learning
  Baifeng Shi · Judy Hoffman · Kate Saenko · Trevor Darrell · Huijuan Xu
- 2020 Poster: Fighting Copycat Agents in Behavioral Cloning from Observation Histories
  Chuan Wen · Jierui Lin · Trevor Darrell · Dinesh Jayaraman · Yang Gao
- 2019 : Poster Presentations
  Rahul Mehta · Andrew Lampinen · Binghong Chen · Sergio Pascual-Diaz · Jordi Grau-Moya · Aldo Faisal · Jonathan Tompson · Yiren Lu · Khimya Khetarpal · Martin Klissarov · Pierre-Luc Bacon · Doina Precup · Thanard Kurutach · Aviv Tamar · Pieter Abbeel · Jinke He · Maximilian Igl · Shimon Whiteson · Wendelin Boehmer · Raphaël Marinier · Olivier Pietquin · Karol Hausman · Sergey Levine · Chelsea Finn · Tianhe Yu · Lisa Lee · Benjamin Eysenbach · Emilio Parisotto · Eric Xing · Ruslan Salakhutdinov · Hongyu Ren · Anima Anandkumar · Deepak Pathak · Christopher Lu · Trevor Darrell · Alexei Efros · Phillip Isola · Feng Liu · Bo Han · Gang Niu · Masashi Sugiyama · Saurabh Kumar · Janith Petangoda · Johan Ferret · James McClelland · Kara Liu · Animesh Garg · Robert Lange
- 2019 : Oral Presentations
  Janith Petangoda · Sergio Pascual-Diaz · Jordi Grau-Moya · Raphaël Marinier · Olivier Pietquin · Alexei Efros · Phillip Isola · Trevor Darrell · Christopher Lu · Deepak Pathak · Johan Ferret
- 2019 Workshop: AI for Humanitarian Assistance and Disaster Response
  Ritwik Gupta · Robin Murphy · Trevor Darrell · Eric Heim · Zhangyang Wang · Bryce Goodman · Piotr Biliński
- 2019 Poster: Self-Critical Reasoning for Robust Visual Question Answering
  Jialin Wu · Raymond Mooney
- 2019 Spotlight: Self-Critical Reasoning for Robust Visual Question Answering
  Jialin Wu · Raymond Mooney
- 2019 Poster: Compositional Plan Vectors
  Coline Devin · Daniel Geng · Pieter Abbeel · Trevor Darrell · Sergey Levine
- 2019 Poster: Learning to Control Self-Assembling Morphologies: A Study of Generalization via Modularity
  Deepak Pathak · Christopher Lu · Trevor Darrell · Phillip Isola · Alexei Efros
- 2019 Spotlight: Learning to Control Self-Assembling Morphologies: A Study of Generalization via Modularity
  Deepak Pathak · Christopher Lu · Trevor Darrell · Phillip Isola · Alexei Efros
- 2018 : Learning to Understand Natural Language Instructions through Human-Robot Dialog
  Raymond Mooney
- 2018 Poster: Speaker-Follower Models for Vision-and-Language Navigation
  Daniel Fried · Ronghang Hu · Volkan Cirik · Anna Rohrbach · Jacob Andreas · Louis-Philippe Morency · Taylor Berg-Kirkpatrick · Kate Saenko · Dan Klein · Trevor Darrell
- 2017 : Invited Talk 7
  Trevor Darrell
- 2017 : Adaptive Deep Learning for Perception, Action, and Explanation, Trevor Darrell (UC Berkeley)
  Trevor Darrell
- 2017 : Panel Discussion
  Felix Hill · Olivier Pietquin · Jack Gallant · Raymond Mooney · Sanja Fidler · Chen Yu · Devi Parikh
- 2017 : Visually Grounded Language: Past, Present, and Future...
  Raymond Mooney
- 2017 Poster: Toward Multimodal Image-to-Image Translation
  Jun-Yan Zhu · Richard Zhang · Deepak Pathak · Trevor Darrell · Alexei Efros · Oliver Wang · Eli Shechtman
- 2016 : Invited Talk: Learning Adaptive Driving Models from Large-scale Video Datasets (Fisher Yu, Huazhe Xu, Dequan Wang, and Trevor Darrell, Berkeley)
  Trevor Darrell
- 2016 Workshop: Machine Learning for Intelligent Transportation Systems
  Li Erran Li · Trevor Darrell
- 2015 : Intro and Adapting Deep Networks Across Domains, Modalities, and Tasks
  Trevor Darrell
- 2015 : Generating Natural-Language Video Descriptions using LSTM Recurrent Neural Networks
  Raymond Mooney
- 2014 Poster: Do Convnets Learn Correspondence?
  Jonathan L Long · Ning Zhang · Trevor Darrell
- 2014 Poster: LSDA: Large Scale Detection through Adaptation
  Judy Hoffman · Sergio Guadarrama · Eric Tzeng · Ronghang Hu · Jeff Donahue · Ross Girshick · Trevor Darrell · Kate Saenko
- 2014 Poster: Weakly-supervised Discovery of Visual Pattern Configurations
  Hyun Oh Song · Yong Jae Lee · Stefanie Jegelka · Trevor Darrell
- 2013 Poster: Visual Concept Learning: Combining Machine Vision and Bayesian Generalization on Concept Hierarchies
  Yangqing Jia · Joshua T Abbott · Joseph L Austerweil · Tom Griffiths · Trevor Darrell
- 2012 Poster: Learning with Recursive Perceptual Representations
  Oriol Vinyals · Yangqing Jia · Li Deng · Trevor Darrell
- 2012 Poster: Timely Object Recognition
  Sergey K Karayev · Tobi Baumgartner · Mario Fritz · Trevor Darrell
- 2011 Poster: Heavy-tailed Distances for Gradient Based Image Descriptors
  Yangqing Jia · Trevor Darrell
- 2010 Poster: Factorized Latent Spaces with Structured Sparsity
  Yangqing Jia · Mathieu Salzmann · Trevor Darrell
- 2010 Poster: Size Matters: Metric Visual Search Constraints from Monocular Metadata
  Mario J Fritz · Kate Saenko · Trevor Darrell
- 2009 Poster: Learning to Hash with Binary Reconstructive Embeddings
  Brian Kulis · Trevor Darrell
- 2009 Spotlight: Learning to Hash with Binary Reconstructive Embeddings
  Brian Kulis · Trevor Darrell
- 2009 Poster: An Additive Latent Feature Model for Transparent Object Recognition
  Mario J Fritz · Michael J Black · Gary R Bradski · Trevor Darrell
- 2009 Poster: Filtering Abstract Senses From Image Search Results
  Kate Saenko · Trevor Darrell
- 2009 Oral: An Additive Latent Feature Model for Transparent Object Recognition
  Mario J Fritz · Michael J Black · Gary R Bradski · Trevor Darrell
- 2008 Poster: Unsupervised Learning of Visual Sense Models for Polysemous Words
  Kate Saenko · Trevor Darrell
- 2008 Spotlight: Unsupervised Learning of Visual Sense Models for Polysemous Words
  Kate Saenko · Trevor Darrell
- 2006 Poster: Approximate Correspondences in High Dimensions
  Kristen Grauman · Trevor Darrell
- 2006 Spotlight: Approximate Correspondences in High Dimensions
  Kristen Grauman · Trevor Darrell