Computer vision has benefited from initializing multiple deep layers with weights pretrained on large supervised training sets like ImageNet. Natural language processing (NLP) typically sees initialization of only the lowest layer of deep models with pretrained word vectors. In this paper, we use a deep LSTM encoder from an attentional sequence-to-sequence model trained for machine translation (MT) to contextualize word vectors. We show that adding these context vectors (CoVe) improves performance over using only unsupervised word and character vectors on a wide variety of common NLP tasks: sentiment analysis (SST, IMDb), question classification (TREC), entailment (SNLI), and question answering (SQuAD). For fine-grained sentiment analysis and entailment, CoVe improves performance of our baseline models to the state of the art.
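The core idea in the abstract — feed pretrained word vectors through an MT-trained LSTM encoder and concatenate the resulting context vectors (CoVe) with the original vectors — can be sketched minimally as follows. This is an illustrative sketch only: the `CoVeSketch` class and its dimensions are assumptions, and the encoder here is randomly initialized rather than pretrained on machine translation as in the paper.

```python
import torch
import torch.nn as nn

class CoVeSketch(nn.Module):
    """Hypothetical sketch of the CoVe idea: contextualize word vectors
    with a deep bidirectional LSTM encoder, then concatenate."""

    def __init__(self, emb_dim=300, hidden_dim=300):
        super().__init__()
        # In the paper this encoder comes from an attentional seq2seq MT
        # model; here it is untrained and merely stands in for it.
        self.encoder = nn.LSTM(emb_dim, hidden_dim, num_layers=2,
                               bidirectional=True, batch_first=True)

    def forward(self, word_vectors):
        # word_vectors: (batch, seq_len, emb_dim), e.g. GloVe embeddings
        context, _ = self.encoder(word_vectors)  # (batch, seq_len, 2*hidden_dim)
        # Downstream task models consume [word vector; context vector].
        return torch.cat([word_vectors, context], dim=-1)

batch = torch.randn(2, 5, 300)  # 2 sentences, 5 tokens, 300-d vectors
out = CoVeSketch()(batch)
print(out.shape)  # torch.Size([2, 5, 900])
```

Concatenation (rather than replacement) lets the downstream model fall back on the unsupervised word and character vectors where the MT-derived context adds nothing.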
Author Information
Bryan McCann (Salesforce Research)
James Bradbury (Salesforce Research)
Caiming Xiong (Salesforce)
Richard Socher (MetaMind)
More from the Same Authors
- 2021 Poster: Evaluating State-of-the-Art Classification Models Against Bayes Optimality »
  Ryan Theisen · Huan Wang · Lav Varshney · Caiming Xiong · Richard Socher
- 2020 Contributed Talk: ProGen: Language Modeling for Protein Generation »
  Ali Madani · Bryan McCann · Nikhil Naik · · Possu Huang · Richard Socher
- 2020 Poster: Towards Theoretically Understanding Why SGD Generalizes Better Than Adam in Deep Learning »
  Pan Zhou · Jiashi Feng · Chao Ma · Caiming Xiong · Steven Chu Hong Hoi · Weinan E
- 2020 Poster: Theory-Inspired Path-Regularized Differential Network Architecture Search »
  Pan Zhou · Caiming Xiong · Richard Socher · Steven Chu Hong Hoi
- 2020 Oral: Theory-Inspired Path-Regularized Differential Network Architecture Search »
  Pan Zhou · Caiming Xiong · Richard Socher · Steven Chu Hong Hoi
- 2020 Poster: Online Structured Meta-learning »
  Huaxiu Yao · Yingbo Zhou · Mehrdad Mahdavi · Zhenhui (Jessie) Li · Richard Socher · Caiming Xiong
- 2020 Poster: Towards Understanding Hierarchical Learning: Benefits of Neural Representations »
  Minshuo Chen · Yu Bai · Jason Lee · Tuo Zhao · Huan Wang · Caiming Xiong · Richard Socher
- 2019 Poster: LiteEval: A Coarse-to-Fine Framework for Resource Efficient Video Recognition »
  Zuxuan Wu · Caiming Xiong · Yu-Gang Jiang · Larry Davis
- 2019 Poster: Keeping Your Distance: Solving Sparse Reward Tasks Using Self-Balancing Shaped Rewards »
  Alexander Trott · Stephan Zheng · Caiming Xiong · Richard Socher
- 2016: Richard Socher - Tackling the Limits of Deep Learning for NLP »
  Richard Socher
- 2014 Poster: Global Belief Recursive Neural Networks »
  Romain Paulus · Richard Socher · Christopher Manning
- 2013 Demonstration: Easy Text Classification with Machine Learning »
  Richard Socher · Romain Paulus · Bryan McCann · Andrew Y Ng
- 2013 Poster: Reasoning With Neural Tensor Networks for Knowledge Base Completion »
  Richard Socher · Danqi Chen · Christopher D Manning · Andrew Y Ng
- 2013 Poster: Zero-Shot Learning Through Cross-Modal Transfer »
  Richard Socher · Milind Ganjoo · Christopher D Manning · Andrew Y Ng
- 2012 Poster: Recursive Deep Learning on 3D Point Clouds »
  Richard Socher · Bharath Bath · Brody Huval · Christopher D Manning · Andrew Y Ng
- 2011 Poster: Unfolding Recursive Autoencoders for Paraphrase Detection »
  Richard Socher · Eric H Huang · Jeffrey Pennington · Andrew Y Ng · Christopher D Manning
- 2009 Poster: A Bayesian Analysis of Dynamics in Free Recall »
  Richard Socher · Samuel J Gershman · Adler Perotte · Per Sederberg · David Blei · Kenneth Norman