Active learning methods have shown great promise in reducing the number of samples necessary for learning. As automated learning systems are adopted into real-time, real-world decision-making pipelines, it is increasingly important that such algorithms are designed with safety in mind. In this work we investigate the complexity of learning the best safe decision in interactive environments. We reduce this problem to a safe linear bandits problem, where our goal is to find the best arm satisfying certain (unknown) safety constraints. We propose an adaptive experimental design-based algorithm, which we show efficiently trades off between the difficulty of showing that an arm is unsafe and the difficulty of showing that it is suboptimal. To our knowledge, our results are the first on best-arm identification in linear bandits with safety constraints. Empirically, we demonstrate that this approach performs well on synthetic and real-world datasets.
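To make the problem setting concrete, here is a minimal toy sketch of a safe linear bandit instance in the spirit of the abstract. All numbers and names are illustrative assumptions, not the paper's construction: arms are feature vectors x, the reward of an arm is ⟨theta, x⟩, its safety value is ⟨gamma, x⟩, and an arm is safe if that value stays below a threshold tau. Both theta and gamma are unknown to the learner; the snippet only computes the identification target, the best arm among the safe ones.

```python
# Hypothetical safe linear bandit instance (illustrative values only).
tau = 0.5  # safety threshold: arm x is safe iff <gamma, x> <= tau
arms = [(1.0, 0.0), (0.0, 1.0), (0.8, 0.8), (0.2, -0.5)]
theta = (1.0, 0.6)    # unknown reward parameter
gamma = (0.9, -0.2)   # unknown safety parameter

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

safe = [dot(x, gamma) <= tau for x in arms]   # which arms satisfy the constraint
rewards = [dot(x, theta) for x in arms]

# Target of best-arm identification under safety constraints:
best_safe = max((i for i in range(len(arms)) if safe[i]),
                key=lambda i: rewards[i])
```

Note that in this instance the highest-reward arm overall (index 2) is unsafe, so the best safe arm (index 1) differs from the unconstrained optimum; this is exactly the tension between showing an arm is unsafe versus suboptimal that the algorithm must navigate.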
Author Information
Romain Camilleri (University of Washington)
Andrew Wagenmaker (University of Washington)
Jamie Morgenstern (University of Washington)
Lalit Jain (University of Washington)
Kevin Jamieson (University of Washington)
More from the Same Authors
- 2023 Poster: Doubly Constrained Fair Clustering
  John Dickerson · Seyed Esmaeili · Jamie Morgenstern · Claire Jie Zhang
- 2023 Poster: Scalable Membership Inference Attacks via Quantile Regression
  Martin Bertran · Shuai Tang · Aaron Roth · Michael Kearns · Jamie Morgenstern · Steven Wu
- 2023 Poster: Optimal Exploration for Model-Based RL in Nonlinear Systems
  Andrew Wagenmaker · Guanya Shi · Kevin Jamieson
- 2023 Poster: Active representation learning for general task space with applications in robotics
  Yifang Chen · Yingbing Huang · Simon Du · Kevin Jamieson · Guanya Shi
- 2023 Poster: Experimental Designs for Heteroskedastic Variance
  Justin Weltz · Tanner Fiez · Alexander Volfovsky · Eric Laber · Blake Mason · Houssam Nassif · Lalit Jain
- 2022 Poster: Instance-optimal PAC Algorithms for Contextual Bandits
  Zhaoqi Li · Lillian Ratliff · Houssam Nassif · Kevin Jamieson · Lalit Jain
- 2022 Poster: Instance-Dependent Near-Optimal Policy Identification in Linear MDPs via Online Experiment Design
  Andrew Wagenmaker · Kevin Jamieson
- 2021: Beyond No Regret: Instance-Dependent PAC Reinforcement Learning
  Andrew Wagenmaker · Kevin Jamieson
- 2021: Panel: Future directions for tackling distribution shifts
  Tatsunori Hashimoto · Jamie Morgenstern · Judy Hoffman · Andrew Beck
- 2021 Poster: Selective Sampling for Online Best-arm Identification
  Romain Camilleri · Zhihan Xiong · Maryam Fazel · Lalit Jain · Kevin Jamieson
- 2021 Poster: Practical, Provably-Correct Interactive Learning in the Realizable Setting: The Power of True Believers
  Julian Katz-Samuels · Blake Mason · Kevin Jamieson · Rob Nowak
- 2021 Poster: Corruption Robust Active Learning
  Yifang Chen · Simon Du · Kevin Jamieson
- 2020 Workshop: Machine Learning for Economic Policy
  Stephan Zheng · Alexander Trott · Annie Liang · Jamie Morgenstern · David Parkes · Nika Haghtalab
- 2020 Poster: An Empirical Process Approach to the Union Bound: Practical Algorithms for Combinatorial and Linear Bandits
  Julian Katz-Samuels · Lalit Jain · Zohar Karnin · Kevin Jamieson
- 2020 Poster: Finding All $\epsilon$-Good Arms in Stochastic Bandits
  Blake Mason · Lalit Jain · Ardhendu Tripathy · Robert Nowak
- 2019 Poster: A New Perspective on Pool-Based Active Classification and False-Discovery Control
  Lalit Jain · Kevin Jamieson
- 2019 Poster: Sequential Experimental Design for Transductive Linear Bandits
  Lalit Jain · Kevin Jamieson · Tanner Fiez · Lillian Ratliff
- 2019 Poster: Non-Asymptotic Gap-Dependent Regret Bounds for Tabular MDPs
  Max Simchowitz · Kevin Jamieson
- 2018 Poster: A Bandit Approach to Sequential Experimental Design with False Discovery Control
  Kevin Jamieson · Lalit Jain