Poster
On the Epistemic Limits of Personalized Prediction
Lucas Monteiro Paes · Carol Long · Berk Ustun · Flavio Calmon
Machine learning models are often personalized by using group attributes that encode personal characteristics (e.g., sex, age group, HIV status). In such settings, individuals expect to receive more accurate predictions in return for disclosing group attributes to the personalized model. We study when we can tell that a personalized model upholds this principle for every group who provides personal data. We introduce a metric called the benefit of personalization (BoP) to measure the smallest gain in accuracy that any group expects to receive from a personalized model. We describe how the BoP can be used to carry out basic routines to audit a personalized model, including: (i) hypothesis tests to check that a personalized model improves performance for every group; (ii) estimation procedures to bound the minimum gain in personalization. We characterize the reliability of these routines in a finite-sample regime and present minimax bounds on both the probability of error for BoP hypothesis tests and the mean-squared error of BoP estimates. Our results show that we can only claim that personalization improves performance for each group who provides data when we explicitly limit the number of group attributes used by a personalized model. In particular, we show that it is impossible to reliably verify that a personalized classifier with $k \geq 19$ binary group attributes will benefit every group who provides personal data using a dataset of $n = 8\times10^9$ samples -- one for each person in the world.
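As a rough illustration of the metric described in the abstract, the sketch below computes an empirical BoP as the smallest per-group reduction in error rate when moving from a generic model to a personalized one. The function names, the 0-1 error measure, and the choice to treat every combination of $k$ binary attributes as its own group (giving $2^k$ groups) are assumptions made for this example, not definitions taken from the paper. The minimax bounds behind the $k \geq 19$ claim are not reproduced here; the second helper only shows how quickly the average sample count per group shrinks as $k$ grows.

```python
from collections import defaultdict


def empirical_bop(y_true, y_generic, y_personalized, groups):
    """Return (minimum per-group gain, dict of per-group gains).

    Gain for a group = generic error rate - personalized error rate,
    so a positive value means personalization helped that group.
    This is an assumed plug-in estimate, not the paper's exact procedure.
    """
    counts = defaultdict(int)
    err_generic = defaultdict(int)
    err_personalized = defaultdict(int)
    for y, yg, yp, g in zip(y_true, y_generic, y_personalized, groups):
        counts[g] += 1
        err_generic[g] += int(yg != y)
        err_personalized[g] += int(yp != y)
    gains = {
        g: (err_generic[g] - err_personalized[g]) / counts[g] for g in counts
    }
    return min(gains.values()), gains


def samples_per_group(n, k):
    """Average samples per group, assuming k binary attributes give 2**k groups."""
    return n / 2 ** k


if __name__ == "__main__":
    # Toy data: personalization helps group "a" and leaves group "b" unchanged.
    y_true = [1, 1, 0, 0]
    groups = ["a", "a", "b", "b"]
    y_generic = [0, 1, 0, 1]
    y_personalized = [1, 1, 0, 1]
    bop, gains = empirical_bop(y_true, y_generic, y_personalized, groups)
    print("per-group gains:", gains)
    print("empirical BoP:", bop)
    print("avg samples per group (n=8e9, k=19):", samples_per_group(8e9, 19))
```

In the toy data the minimum gain is zero, so an audit based on this estimate could not claim that every group benefits, even though one group clearly does.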
Author Information
Lucas Monteiro Paes (Harvard University)
I am a second-year Applied Mathematics Ph.D. student in the School of Engineering and Applied Sciences (SEAS) at Harvard University, working with Prof. Flavio Calmon. My main research interests are fairness, information theory, and machine learning applications for the social good. Before joining Harvard, I received an M.S. in Computational Mathematics and Modelling from Instituto de Matemática Pura e Aplicada (IMPA) in Brazil.
Carol Long (Harvard University)
Berk Ustun (UC San Diego)
Flavio Calmon (Harvard University)
More from the Same Authors
-
2021 : Who Gets the Benefit of the Doubt? Racial Bias in Machine Learning Algorithms Applied to Secondary School Math Education »
Haewon Jeong · Michael D. Wu · Nilanjana Dasgupta · Muriel Medard · Flavio Calmon -
2021 : Learning through Recourse under Censoring »
Jennifer Chien · Berk Ustun · Margaret Roberts -
2022 : Predictive Multiplicity in Probabilistic Classification »
Jamelle Watson-Daniels · David Parkes · Berk Ustun -
2022 : When Personalization Harms: Reconsidering the Use of Group Attributes in Prediction »
Vinith Suriyakumar · Marzyeh Ghassemi · Berk Ustun -
2022 : Participatory Systems for Personalized Prediction »
Hailey James · Chirag Nagpal · Katherine Heller · Berk Ustun -
2022 : Participatory Systems for Personalized Prediction »
Hailey James · Berk Ustun · Chirag Nagpal · Katherine Heller -
2022 Panel: Panel 1C-7: Beyond Adult and… & Uncalibrated Models Can… »
Kailas Vodrahalli · Flavio Calmon -
2022 Poster: Rashomon Capacity: A Metric for Predictive Multiplicity in Classification »
Hsiang Hsu · Flavio Calmon -
2022 Poster: Beyond Adult and COMPAS: Fair Multi-Class Prediction via Information Projection »
Wael Alghamdi · Hsiang Hsu · Haewon Jeong · Hao Wang · Peter Michalak · Shahab Asoodeh · Flavio Calmon -
2021 Poster: Learning Optimal Predictive Checklists »
Haoran Zhang · Quaid Morris · Berk Ustun · Marzyeh Ghassemi -
2021 Poster: Analyzing the Generalization Capability of SGLD Using Properties of Gaussian Channels »
Hao Wang · Yizhe Huang · Rui Gao · Flavio Calmon -
2020 : Invited Talk 4: Actionable Recourse in Machine Learning »
Berk Ustun -
2018 : Posters 1 »
Wei Wei · Flavio Calmon · Travis Dick · Leilani Gilpin · Maroussia Lévesque · Malek Ben Salem · Michael Wang · Jack Fitzsimons · Dimitri Semenovich · Linda Gu · Nathaniel Fruchter -
2017 Poster: Optimized Pre-Processing for Discrimination Prevention »
Flavio Calmon · Dennis Wei · Bhanukiran Vinzamuri · Karthikeyan Natesan Ramamurthy · Kush Varshney