Poster
Unsupervised Learning of Spoken Language with Visual Context
David Harwath · Antonio Torralba · James Glass
Humans learn to speak before they can read or write, so why can’t computers do the same? In this paper, we present a deep neural network model capable of rudimentary spoken language acquisition using untranscribed audio training data, whose only supervision comes in the form of contextually relevant visual images. We describe the collection of our data, which comprises over 120,000 spoken audio captions for the Places image dataset, and evaluate our model on an image search and annotation task. We also provide some visualizations that suggest our model is learning to recognize meaningful words within the caption spectrograms.
Author Information
David Harwath (MIT CSAIL)
Antonio Torralba (MIT CSAIL)
James Glass (MIT CSAIL)
More from the Same Authors
-
2021 : ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation »
Chuang Gan · Jeremy Schwartz · Seth Alter · Damian Mrowca · Martin Schrimpf · James Traer · Julian De Freitas · Jonas Kubilius · Abhishek Bhandwaldar · Nick Haber · Megumi Sano · Kuno Kim · Elias Wang · Michael Lingelbach · Aidan Curtis · Kevin Feigelis · Daniel Bear · Dan Gutfreund · David Cox · Antonio Torralba · James J DiCarlo · Josh Tenenbaum · Josh McDermott · Dan Yamins -
2021 Spotlight: Learning to Compose Visual Relations »
Nan Liu · Shuang Li · Yilun Du · Josh Tenenbaum · Antonio Torralba -
2021 Spotlight: Learning to See by Looking at Noise »
Manel Baradad Jurjo · Jonas Wulff · Tongzhou Wang · Phillip Isola · Antonio Torralba -
2021 Spotlight: Measuring Generalization with Optimal Transport »
Ching-Yao Chuang · Youssef Mroueh · Kristjan Greenewald · Antonio Torralba · Stefanie Jegelka -
2021 : 3D Neural Scene Representations for Visuomotor Control »
Yunzhu Li · Shuang Li · Vincent Sitzmann · Pulkit Agrawal · Antonio Torralba -
2021 Poster: Learning to Compose Visual Relations »
Nan Liu · Shuang Li · Yilun Du · Josh Tenenbaum · Antonio Torralba -
2021 Poster: EditGAN: High-Precision Semantic Image Editing »
Huan Ling · Karsten Kreis · Daiqing Li · Seung Wook Kim · Antonio Torralba · Sanja Fidler -
2021 Poster: Learning to See by Looking at Noise »
Manel Baradad Jurjo · Jonas Wulff · Tongzhou Wang · Phillip Isola · Antonio Torralba -
2021 Poster: PTR: A Benchmark for Part-based Conceptual, Relational, and Physical Reasoning »
Yining Hong · Li Yi · Josh Tenenbaum · Antonio Torralba · Chuang Gan -
2021 Poster: Editing a classifier by rewriting its prediction rules »
Shibani Santurkar · Dimitris Tsipras · Mahalaxmi Elango · David Bau · Antonio Torralba · Aleksander Madry -
2021 Poster: Measuring Generalization with Optimal Transport »
Ching-Yao Chuang · Youssef Mroueh · Kristjan Greenewald · Antonio Torralba · Stefanie Jegelka -
2020 Poster: Causal Discovery in Physical Systems from Videos »
Yunzhu Li · Antonio Torralba · Anima Anandkumar · Dieter Fox · Animesh Garg -
2018 Poster: Visual Object Networks: Image Generation with Disentangled 3D Representations »
Jun-Yan Zhu · Zhoutong Zhang · Chengkai Zhang · Jiajun Wu · Antonio Torralba · Josh Tenenbaum · Bill Freeman -
2017 Poster: Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data »
Wei-Ning Hsu · Yu Zhang · James Glass -
2016 : Invited Talk - Learning to see objects by listening »
Antonio Torralba -
2014 Poster: Learning Deep Features for Scene Recognition using Places Database »
Bolei Zhou · Agata Lapedriza · Jianxiong Xiao · Antonio Torralba · Aude Oliva -
2014 Spotlight: Learning Deep Features for Scene Recognition using Places Database »
Bolei Zhou · Agata Lapedriza · Jianxiong Xiao · Antonio Torralba · Aude Oliva -
2012 Poster: Modeling the Forgetting Process using Image Regions »
Aditya Khosla · Jianxiong Xiao · Antonio Torralba · Aude Oliva -
2012 Poster: Localizing 3D cuboids in single-view images »
Jianxiong Xiao · Bryan C Russell · Antonio Torralba -
2011 Poster: Learning to Learn with Compound HD Models »
Russ Salakhutdinov · Josh Tenenbaum · Antonio Torralba -
2011 Poster: Understanding the Intrinsic Memorability of Images »
Phillip Isola · Devi Parikh · Antonio Torralba · Aude Oliva -
2011 Spotlight: Learning to Learn with Compound HD Models »
Russ Salakhutdinov · Josh Tenenbaum · Antonio Torralba -
2011 Poster: Transfer Learning by Borrowing Examples »
Joseph Lim · Russ Salakhutdinov · Antonio Torralba -
2009 Poster: Unsupervised Detection of Regions of Interest Using Iterative Link Analysis »
Gunhee Kim · Antonio Torralba -
2009 Session: Oral session 7: Vision and Inference »
Antonio Torralba -
2009 Poster: Semi-Supervised Learning in Gigantic Image Collections »
Rob Fergus · Yair Weiss · Antonio Torralba -
2009 Oral: Semi-Supervised Learning in Gigantic Image Collections »
Rob Fergus · Yair Weiss · Antonio Torralba -
2009 Poster: Nonparametric Bayesian Texture Learning and Synthesis »
Leo Zhu · Yuanhao Chen · Bill Freeman · Antonio Torralba -
2009 Tutorial: Understanding Visual Scenes »
Antonio Torralba -
2008 Poster: Spectral Hashing »
Yair Weiss · Antonio Torralba · Rob Fergus -
2007 Spotlight: Object Recognition by Scene Alignment »
Bryan C Russell · Antonio Torralba · Ce Liu · Rob Fergus · William Freeman -
2007 Poster: Object Recognition by Scene Alignment »
Bryan C Russell · Antonio Torralba · Ce Liu · Rob Fergus · William Freeman