Multi-hop reasoning (i.e., reasoning across two or more documents) is a key ingredient for NLP models that leverage large corpora to exhibit broad knowledge. To retrieve evidence passages, multi-hop models must contend with a fast-growing search space across the hops, represent complex queries that combine multiple information needs, and resolve ambiguity about the best order in which to hop between training passages. We tackle these problems via Baleen, a system that improves the accuracy of multi-hop retrieval while learning robustly from weak training signals in the many-hop setting. To tame the search space, we propose condensed retrieval, a pipeline that summarizes the retrieved passages after each hop into a single compact context. To model complex queries, we introduce a focused late interaction retriever that allows different parts of the same query representation to match disparate relevant passages. Lastly, to infer the hopping dependencies among unordered training passages, we devise latent hop ordering, a weak-supervision strategy in which the trained retriever itself selects the sequence of hops. We evaluate Baleen on retrieval for two-hop question answering and many-hop claim verification, establishing state-of-the-art performance.
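The condensed retrieval idea described above (retrieve, condense the results into one compact context, then use that context to form the next hop's query) can be illustrated with a minimal sketch. This is not Baleen's implementation: the toy corpus, the word-overlap scoring, and the names `retrieve`, `condense`, and `condensed_retrieval` are all assumptions made for illustration, standing in for the learned retriever and condenser.

```python
import re
from typing import List, Set

# Toy corpus, illustrative only (not taken from the paper's datasets).
CORPUS = [
    "The Eiffel Tower is in Paris. It opened in 1889.",
    "Paris is the capital of France. It lies on the Seine.",
    "Berlin is the capital of Germany. It lies on the Spree.",
]


def tokens(text: str) -> Set[str]:
    """Lowercased set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, k: int, seen: Set[int]) -> List[int]:
    """Hypothetical retriever: rank unseen passages by word overlap with the query."""
    q = tokens(query)
    candidates = [i for i in range(len(CORPUS)) if i not in seen]
    candidates.sort(key=lambda i: len(q & tokens(CORPUS[i])), reverse=True)
    return candidates[:k]


def condense(query: str, passage: str) -> str:
    """Hypothetical condenser: keep the one sentence that overlaps most with the query."""
    q = tokens(query)
    sentences = re.split(r"(?<=[.!?])\s+", passage.strip())
    return max(sentences, key=lambda s: len(q & tokens(s)))


def condensed_retrieval(question: str, hops: int = 2, k: int = 1) -> str:
    """Run several hops, carrying forward only a compact context between hops."""
    context: List[str] = []
    seen: Set[int] = set()
    for _ in range(hops):
        # The next query combines the question with the condensed context so far.
        query = " ".join([question] + context)
        for i in retrieve(query, k, seen):
            seen.add(i)
            context.append(condense(query, CORPUS[i]))
    return " ".join(context)


context = condensed_retrieval("Which country is the Eiffel Tower in?")
print(context)
```

In this sketch the second hop only finds the passage about France because the first hop's condensed sentence adds "Paris" to the query. The query grows by one sentence per hop rather than by whole passages, which is the sense in which condensing keeps the search manageable.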
Author Information
Omar Khattab (Stanford University)
Christopher Potts (Stanford University)
Matei Zaharia (Stanford and Databricks)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Spotlight: Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval
More from the Same Authors
- 2021: ReaSCAN: Compositional Reasoning in Language Grounding
  Zhengxuan Wu · Elisa Kreiss · Desmond Ong · Christopher Potts
- 2022 Poster: CEBaB: Estimating the Causal Effects of Real-World Concepts on NLP Model Behavior
  Eldar D Abraham · Karel D'Oosterlinck · Amir Feder · Yair Gat · Atticus Geiger · Christopher Potts · Roi Reichart · Zhengxuan Wu
- 2021: Intuitive Image Descriptions are Context-Sensitive
  Shayan Hooshmand · Elisa Kreiss · Christopher Potts
- 2021 Poster: Causal Abstractions of Neural Networks
  Atticus Geiger · Hanson Lu · Thomas Icard · Christopher Potts
- 2021 Poster: Decrypting Cryptic Crosswords: Semantically Complex Wordplay Puzzles as a Target for NLP
  Josh Rozner · Christopher Potts · Kyle Mahowald
- 2021 Poster: Dynaboard: An Evaluation-As-A-Service Platform for Holistic Next-Generation Benchmarking
  Zhiyi Ma · Kawin Ethayarajh · Tristan Thrush · Somya Jain · Ledell Wu · Robin Jia · Christopher Potts · Adina Williams · Douwe Kiela
- 2020 Poster: FrugalML: How to use ML Prediction APIs more accurately and cheaply
  Lingjiao Chen · Matei Zaharia · James Zou
- 2020 Oral: FrugalML: How to use ML Prediction APIs more accurately and cheaply
  Lingjiao Chen · Matei Zaharia · James Zou