Consider a setting where we wish to automate an expensive task with a machine learning algorithm using a limited labeling resource. In such settings, examples routed for labeling are often out of scope for the machine learning algorithm. For example, in a spam detection setting, human reviewers not only provide labeled data but are such high-quality detectors of spam that examples routed to them no longer require machine evaluation. As a consequence, the distribution of examples routed to the machine is intimately tied to the process generating labels. We introduce a formalization of this setting, and give an algorithm that simultaneously learns a model and decides when to request a label by leveraging ideas from both the abstention and active learning literatures. We prove an upper bound on the algorithm's label complexity and a matching lower bound for any algorithm in this setting. We conduct a thorough set of experiments including an ablation study to test different components of our algorithm. We demonstrate the effectiveness of an efficient version of our algorithm over margin sampling on a variety of datasets.
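For readers unfamiliar with the baseline named above, margin sampling is a standard active learning heuristic: request labels for the examples where the gap between the model's two most probable classes is smallest. The following is a minimal sketch of that baseline only, not of the authors' algorithm. The function names, the `budget` parameter, and the toy probabilities are assumptions made for illustration.

```python
def margin(probs):
    """Gap between the two largest class probabilities for one example."""
    top_two = sorted(probs, reverse=True)[:2]
    return top_two[0] - top_two[1]


def margin_sampling(pool_probs, budget):
    """Return indices of the `budget` pool examples with the smallest margins.

    pool_probs: list of per-example class-probability lists (hypothetical
    model outputs); budget: number of labels we can afford to request.
    """
    ranked = sorted(range(len(pool_probs)), key=lambda i: margin(pool_probs[i]))
    return ranked[:budget]


# Toy pool of four examples with binary class probabilities (made-up values).
pool_probs = [
    [0.95, 0.05],  # confident prediction, large margin
    [0.55, 0.45],  # uncertain
    [0.70, 0.30],
    [0.51, 0.49],  # most uncertain, smallest margin
]

# Indices of the two examples to route for labeling.
selected = margin_sampling(pool_probs, budget=2)
print(selected)
```

Note that this baseline treats labeling purely as a way to gather training data. It does not model the situation described in the abstract, where an example routed to a human reviewer no longer needs a machine prediction.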
Author Information
Kareem Amin (Google Research)
Giulia DeSalvo (Google Research)
Afshin Rostamizadeh (Google Research)
More from the Same Authors
- 2021 Spotlight: Online Active Learning with Surrogate Loss Functions
  Giulia DeSalvo · Claudio Gentile · Tobias Sommer Thune
- 2021 Poster: Batch Active Learning at Scale
  Gui Citovsky · Giulia DeSalvo · Claudio Gentile · Lazaros Karydas · Anand Rajagopalan · Afshin Rostamizadeh · Sanjiv Kumar
- 2021 Poster: Online Active Learning with Surrogate Loss Functions
  Giulia DeSalvo · Claudio Gentile · Tobias Sommer Thune
- 2021 Poster: Learning with User-Level Privacy
  Daniel Levy · Ziteng Sun · Kareem Amin · Satyen Kale · Alex Kulesza · Mehryar Mohri · Ananda Theertha Suresh
- 2020 Poster: An Analysis of SVD for Deep Rotation Estimation
  Jake Levinson · Carlos Esteves · Kefan Chen · Noah Snavely · Angjoo Kanazawa · Afshin Rostamizadeh · Ameesh Makadia
- 2019 : Pan-Private Uniformity Testing
  Kareem Amin · Matthew Joseph
- 2019 Poster: Differentially Private Covariance Estimation
  Kareem Amin · Travis Dick · Alex Kulesza · Andres Munoz Medina · Sergei Vassilvitskii
- 2017 Poster: Repeated Inverse Reinforcement Learning
  Kareem Amin · Nan Jiang · Satinder Singh
- 2017 Spotlight: Repeated Inverse Reinforcement Learning
  Kareem Amin · Nan Jiang · Satinder Singh
- 2015 Workshop: The 1st International Workshop "Feature Extraction: Modern Questions and Challenges"
  Dmitry Storcheus · Sanjiv Kumar · Afshin Rostamizadeh
- 2014 Poster: Repeated Contextual Auctions with Strategic Buyers
  Kareem Amin · Afshin Rostamizadeh · Umar Syed
- 2013 Poster: Learning Prices for Repeated Auctions with Strategic Buyers
  Kareem Amin · Afshin Rostamizadeh · Umar Syed
- 2009 Poster: Learning Non-Linear Combinations of Kernels
  Corinna Cortes · Mehryar Mohri · Afshin Rostamizadeh
- 2008 Workshop: Kernel Learning: Automatic Selection of Optimal Kernels
  Corinna Cortes · Arthur Gretton · Gert Lanckriet · Mehryar Mohri · Afshin Rostamizadeh
- 2008 Poster: Domain Adaptation with Multiple Sources
  Yishay Mansour · Mehryar Mohri · Afshin Rostamizadeh
- 2008 Spotlight: Domain Adaptation with Multiple Sources
  Yishay Mansour · Mehryar Mohri · Afshin Rostamizadeh
- 2008 Poster: Rademacher Complexity Bounds for Non-I.I.D. Processes
  Mehryar Mohri · Afshin Rostamizadeh
- 2007 Poster: Stability Bounds for Non-i.i.d. Processes
  Mehryar Mohri · Afshin Rostamizadeh