Information retrieval and ranking systems mediate access to information by directing users’ attention when they issue search queries. Rankings produced by such systems often have high real-world impact: for example, in online recruitment, a recruiter’s “attention” translates into economically consequential opportunities. Training robust and fair ranking and retrieval systems is therefore important to ensure that attention is allocated fairly. In general, these systems work by predicting a “relevance” score that indicates how relevant an item is to the issued query. In our work, we first inspect regions of low performance in existing ranking and content recommendation algorithms that dominate commercial use of machine learning. We find that low ranking quality is closely related to higher uncertainty in relevance scores. Building on this finding, we propose a framework consisting of a set of expert models and a defer-vs-no-defer prediction model. The expert models either use more information about the user (e.g., their demographic characteristics) or incorporate more context around the search query (e.g., by explicitly asking for more words in the query) to improve performance. The deferral decision is made using the uncertainty in ranking scores. The utility and fairness of our approach will be benchmarked in multiple setups, ranging from ranking based on past user interactions with information retrieval systems to online hiring and health information ranking settings.
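To make the deferral idea concrete, the following is a minimal sketch rather than the proposed implementation. It assumes that uncertainty is estimated as the per-item standard deviation of relevance scores across a small ensemble, and that the defer-vs-no-defer model is replaced by a simple threshold on the mean uncertainty; both choices are illustrative assumptions. The names `score_uncertainty`, `should_defer`, `rank_with_deferral`, and `expert_scores_fn` are hypothetical.

```python
from statistics import mean, pstdev
from typing import Callable, List, Sequence, Tuple


def score_uncertainty(ensemble_scores: Sequence[Sequence[float]]) -> List[float]:
    """Per-item uncertainty: standard deviation across ensemble members.

    ensemble_scores[m][i] is ensemble member m's relevance score for item i.
    (Assumption: ensemble disagreement is used as the uncertainty estimate.)
    """
    return [pstdev(item_scores) for item_scores in zip(*ensemble_scores)]


def should_defer(ensemble_scores: Sequence[Sequence[float]], threshold: float) -> bool:
    """Stand-in for the defer-vs-no-defer model: defer to an expert when the
    mean per-item uncertainty for this query exceeds a fixed threshold."""
    return mean(score_uncertainty(ensemble_scores)) > threshold


def rank(scores: Sequence[float]) -> List[int]:
    """Return item indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def rank_with_deferral(
    ensemble_scores: Sequence[Sequence[float]],
    expert_scores_fn: Callable[[], Sequence[float]],
    threshold: float,
) -> Tuple[List[int], bool]:
    """Rank with the base model, or defer to an expert when uncertain.

    expert_scores_fn is a hypothetical callback standing in for an expert
    model that has access to extra user information or query context.
    Returns the ranking and a flag indicating whether deferral happened.
    """
    if should_defer(ensemble_scores, threshold):
        return rank(expert_scores_fn()), True
    base_scores = [mean(item_scores) for item_scores in zip(*ensemble_scores)]
    return rank(base_scores), False


# Usage: three ensemble members scoring three items for one query.
confident = [[0.90, 0.50, 0.10], [0.88, 0.52, 0.12], [0.91, 0.49, 0.10]]
uncertain = [[0.90, 0.10, 0.50], [0.10, 0.90, 0.50], [0.50, 0.50, 0.50]]
expert = lambda: [0.2, 0.7, 0.4]  # illustrative expert scores

ranking_a, deferred_a = rank_with_deferral(confident, expert, threshold=0.1)
ranking_b, deferred_b = rank_with_deferral(uncertain, expert, threshold=0.1)
```

In this sketch the first query is ranked by the averaged base scores, while the second, where the ensemble members disagree strongly, is routed to the expert. A learned deferral model would replace the fixed threshold.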