In this work we study loss functions for learning and evaluating probability distributions over large discrete domains. Unlike classification or regression, where a wide variety of loss functions are used, the distribution learning and density estimation literature applies very few losses beyond the dominant \emph{log loss}. We aim to understand this fact by taking an axiomatic approach to the design of loss functions for distributions. We start by proposing a set of desirable criteria that any good loss function should satisfy. Intuitively, these criteria require that the loss function faithfully evaluate a candidate distribution, both in expectation and when estimated from a few samples. Interestingly, we observe that \emph{no loss function} satisfies all of these criteria. However, one can circumvent this impossibility by introducing a natural restriction on the set of candidate distributions. Specifically, we require that candidates be \emph{calibrated} with respect to the target distribution, i.e., they may contain less information than the target but otherwise do not significantly distort the truth. We show that, after restricting to this set of distributions, the log loss and a large variety of other losses satisfy the desired criteria. These results pave the way for future investigations of distribution learning that look beyond the log loss, choosing a loss function based on application or domain need.
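For concreteness, the minimal sketch below (not from the paper; the toy target, candidate, and sample size are illustrative assumptions) shows the two evaluation modes the abstract refers to for the log loss: the expected loss of a candidate distribution q against a target p, and its empirical estimate from a few samples drawn from p.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: a target distribution p and a candidate q
# over a small discrete domain (the paper considers large domains).
p = np.array([0.5, 0.25, 0.125, 0.125])   # target distribution
q = np.array([0.4, 0.3, 0.2, 0.1])        # candidate being evaluated

# Expected log loss (cross-entropy) of q under the target p:
#   E_{x~p}[-log q(x)] = -sum_x p(x) log q(x)
expected_log_loss = -np.sum(p * np.log(q))

# Empirical estimate of the same quantity from a few samples drawn from p,
# mirroring the abstract's "estimated from a few samples" criterion.
samples = rng.choice(len(p), size=20, p=p)
empirical_log_loss = -np.log(q[samples]).mean()

print(f"expected log loss:  {expected_log_loss:.4f}")
print(f"empirical estimate: {empirical_log_loss:.4f}")
```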
Author Information
Nika Haghtalab (Cornell University)
Cameron Musco (Microsoft Research)
Bo Waggoner (U. Colorado, Boulder)
More from the Same Authors
- 2022 Spotlight: Kernel Interpolation with Sparse Grids
  Mohit Yadav · Daniel Sheldon · Cameron Musco
- 2022 Poster: Kernel Interpolation with Sparse Grids
  Mohit Yadav · Daniel Sheldon · Cameron Musco
- 2022 Poster: Modeling Transitivity and Cyclicity in Directed Graphs via Binary Code Box Embeddings
  Dongxu Zhang · Michael Boratko · Cameron Musco · Andrew McCallum
- 2022 Poster: Simplified Graph Convolution with Heterophily
  Sudhanshu Chanpuriya · Cameron Musco
- 2022 Poster: Sample Constrained Treatment Effect Estimation
  Raghavendra Addanki · David Arbour · Tung Mai · Cameron Musco · Anup Rao
- 2021 Spotlight 3: Efficient Competitions and Online Learning with Strategic Forecasters
  Anish Thilagar · Rafael Frongillo · Bo Waggoner · Robert Gomez
- 2021 Poster: On the Power of Edge Independent Graph Models
  Sudhanshu Chanpuriya · Cameron Musco · Konstantinos Sotiropoulos · Charalampos Tsourakakis
- 2021 Poster: Coresets for Classification – Simplified and Strengthened
  Tung Mai · Cameron Musco · Anup Rao
- 2020 Workshop: Machine Learning for Economic Policy
  Stephan Zheng · Alexander Trott · Annie Liang · Jamie Morgenstern · David Parkes · Nika Haghtalab
- 2020 Poster: Fourier Sparse Leverage Scores and Approximate Kernel Learning
  Tamas Erdelyi · Cameron Musco · Christopher Musco
- 2020 Spotlight: Fourier Sparse Leverage Scores and Approximate Kernel Learning
  Tamas Erdelyi · Cameron Musco · Christopher Musco
- 2020 Poster: Node Embeddings and Exact Low-Rank Representations of Complex Networks
  Sudhanshu Chanpuriya · Cameron Musco · Konstantinos Sotiropoulos · Charalampos Tsourakakis
- 2019 Workshop: Bridging Game Theory and Deep Learning
  Ioannis Mitliagkas · Gauthier Gidel · Niao He · Reyhane Askari Hemmat · N H · Nika Haghtalab · Simon Lacoste-Julien
- 2019 Poster: Equal Opportunity in Online Classification with Partial Feedback
  Yahav Bechavod · Katrina Ligett · Aaron Roth · Bo Waggoner · Steven Wu
- 2019 Poster: An Embedding Framework for Consistent Polyhedral Surrogates
  Jessica Finocchiaro · Rafael Frongillo · Bo Waggoner
- 2017 Workshop: Learning in the Presence of Strategic Behavior
  Nika Haghtalab · Yishay Mansour · Tim Roughgarden · Vasilis Syrgkanis · Jennifer Wortman Vaughan
- 2017 Poster: Collaborative PAC Learning
  Avrim Blum · Nika Haghtalab · Ariel Procaccia · Mingda Qiao
- 2017 Poster: Online Learning with a Hint
  Ofer Dekel · Arthur Flajolet · Nika Haghtalab · Patrick Jaillet