Current state-of-the-art document retrieval solutions mainly follow an index-retrieve paradigm, in which the index is hard to optimize directly for the final retrieval target. In this paper, we aim to show that an end-to-end deep neural network unifying the training and indexing stages can significantly improve the recall performance of traditional methods. To this end, we propose Neural Corpus Indexer (NCI), a sequence-to-sequence network that directly generates relevant document identifiers for a given query. To optimize the recall performance of NCI, we design a prefix-aware weight-adaptive decoder architecture and leverage tailored techniques including query generation, semantic document identifiers, and consistency-based regularization. Empirical studies demonstrate the superiority of NCI on two commonly used academic benchmarks, achieving relative improvements of +21.4% in Recall@1 on the NQ320k dataset and +16.8% in R-Precision on the TriviaQA dataset over the best baseline method.
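For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how Recall@k, R-Precision, and a relative improvement are conventionally computed. It uses the standard textbook definitions and is not taken from the NCI evaluation code; the function names and the toy document identifiers are hypothetical.

```python
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def r_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Precision at rank R, where R is the number of relevant documents."""
    r = len(relevant)
    if r == 0:
        return 0.0
    return len(set(ranked[:r]) & relevant) / r


def relative_improvement(new: float, baseline: float) -> float:
    """Relative gain of `new` over `baseline`, e.g. 0.2 means +20%."""
    return (new - baseline) / baseline


# Toy example with made-up identifiers (not results from the paper).
ranked_ids = ["d3", "d1", "d7"]          # identifiers returned for one query, best first
print(recall_at_k(ranked_ids, {"d1"}, k=1))
print(recall_at_k(ranked_ids, {"d1"}, k=2))
print(r_precision(ranked_ids, {"d1", "d3", "d9"}))
```

In practice these per-query values are averaged over all test queries. When each query has exactly one relevant document, Recall@1 reduces to the fraction of queries whose top-ranked identifier is correct.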
Author Information
Yujing Wang (Microsoft)
Yingyan Hou (Tsinghua University)
Haonan Wang (National University of Singapore)
Ziming Miao (Microsoft)
Shibin Wu (Tsinghua University)
Hao Sun (Peking University)
Qi Chen (Microsoft Research Asia)
Yuqing Xia (Peking University)
Chengmin Chi (Microsoft)
Guoshuai Zhao (Beijing University of Posts and Telecommunications)
Zheng Liu (The Hong Kong University of Science and Technology)
Xing Xie (Microsoft Research Asia)
Hao Sun (Microsoft)
Weiwei Deng (South China University of Technology)
Qi Zhang (Microsoft)
Mao Yang (Microsoft Research Asia)
More from the Same Authors
-
2021 Spotlight: SPANN: Highly-efficient Billion-scale Approximate Nearest Neighborhood Search »
Qi Chen · Bing Zhao · Haidong Wang · Mingqin Li · Chuanjie Liu · Zengzhong Li · Mao Yang · Jingdong Wang -
2021 : WRENCH: A Comprehensive Benchmark for Weak Supervision »
Jieyu Zhang · Yue Yu · · Yujing Wang · Yaming Yang · Mao Yang · Alexander Ratner -
2022 Poster: Deep Active Learning by Leveraging Training Dynamics »
Haonan Wang · Wei Huang · Ziwei Wu · Hanghang Tong · Andrew J Margenot · Jingrui He -
2022 Poster: USB: A Unified Semi-supervised Learning Benchmark for Classification »
Yidong Wang · Hao Chen · Yue Fan · Wang SUN · Ran Tao · Wenxin Hou · Renjie Wang · Linyi Yang · Zhi Zhou · Lan-Zhe Guo · Heli Qi · Zhen Wu · Yu-Feng Li · Satoshi Nakamura · Wei Ye · Marios Savvides · Bhiksha Raj · Takahiro Shinozaki · Bernt Schiele · Jindong Wang · Xing Xie · Yue Zhang -
2022 Poster: FairVFL: A Fair Vertical Federated Learning Framework with Contrastive Adversarial Learning »
Tao Qi · Fangzhao Wu · Chuhan Wu · Lingjuan Lyu · Tong Xu · Hao Liao · Zhongliang Yang · Yongfeng Huang · Xing Xie -
2022 Poster: Understanding Programmatic Weak Supervision via Source-aware Influence Function »
Jieyu Zhang · Haonan Wang · Cheng-Yu Hsieh · Alexander Ratner -
2022 Poster: Self-explaining deep models with logic rule reasoning »
Seungeon Lee · Xiting Wang · Sungwon Han · Xiaoyuan Yi · Xing Xie · Meeyoung Cha -
2022 Poster: Recommender Forest for Efficient Retrieval »
Chao Feng · Wuchao Li · Defu Lian · Zheng Liu · Enhong Chen -
2021 : Billion-Scale Approximate Nearest Neighbor Search Challenge + Q&A »
Harsha Vardhan Simhadri · George Williams · Martin Aumüller · Artem Babenko · Dmitry Baranchuk · Qi Chen · Matthijs Douze · Ravishankar Krishnaswamy · Gopal Srinivasa · Suhas Jayaram Subramanya · Jingdong Wang -
2021 Poster: GraphFormers: GNN-nested Transformers for Representation Learning on Textual Graph »
Junhan Yang · Zheng Liu · Shitao Xiao · Chaozhuo Li · Defu Lian · Sanjay Agrawal · Amit Singh · Guangzhong Sun · Xing Xie -
2021 Poster: SPANN: Highly-efficient Billion-scale Approximate Nearest Neighborhood Search »
Qi Chen · Bing Zhao · Haidong Wang · Mingqin Li · Chuanjie Liu · Zengzhong Li · Mao Yang · Jingdong Wang -
2020 Poster: Spectral Temporal Graph Neural Network for Multivariate Time-series Forecasting »
Defu Cao · Yujing Wang · Juanyong Duan · Ce Zhang · Xia Zhu · Congrui Huang · Yunhai Tong · Bixiong Xu · Jing Bai · Jie Tong · Qi Zhang -
2020 Spotlight: Spectral Temporal Graph Neural Network for Multivariate Time-series Forecasting »
Defu Cao · Yujing Wang · Juanyong Duan · Ce Zhang · Xia Zhu · Congrui Huang · Yunhai Tong · Bixiong Xu · Jing Bai · Jie Tong · Qi Zhang -
2020 Poster: Sampling-Decomposable Generative Adversarial Recommender »
Binbin Jin · Defu Lian · Zheng Liu · Qi Liu · Jianhui Ma · Xing Xie · Enhong Chen