Predictive Inference in Multi-environment Scenarios
in
Workshop: Statistical Frontiers in LLMs and Foundation Models
Abstract
We address the challenge of constructing valid confidence intervals and sets in problems of prediction across multiple environments. We investigate two types of coverage suitable for these problems, extending the jackknife and split-conformal methods to show how to obtain distribution-free coverage in such non-traditional, hierarchical data-generating scenarios. Our contributions also include extensions for settings with non-real-valued responses and a theory of consistency for predictive inference in these general problems. We demonstrate a novel resizing method to adapt to problem difficulty, which applies both to existing approaches for predictive inference with hierarchical data and the methods we develop; this reduces prediction set sizes using limited information from the test environment, a key to the methods’ practical performance, which we evaluate through neurochemical sensing and species classification datasets.