
Workshop: Heavy Tails in ML: Structure, Stability, Dynamics

The Effects of Ensembling on Long-Tailed Data

Estefany Kelly Buchanan · Geoff Pleiss · John Cunningham

Keywords: [ ensembles ] [ long-tail data ] [ imbalanced data ]


Deep ensembles are a popular approach to improving on single-model performance (Lakshminarayanan et al. 2017), by averaging either the logits (Hinton et al. 2015, Webb et al. 2020, Gontijo-Lopes et al. 2022) or the probabilities of multiple models (Dietterich 2000, Lakshminarayanan et al. 2017, Kumar et al. 2022). Recent theoretical work has shown that logit and probability ensembles have different benefits (Gupta et al. 2022, Wood et al. 2023), but to our knowledge these ensembling approaches have not been compared systematically on balanced versus imbalanced data. In this work, we show that for balanced datasets there is no significant difference between logit and probability ensembles in terms of accuracy and ranked calibration. On long-tailed datasets, however, we show that logit ensembling yields gains when combined with imbalance bias reduction losses. Our results in turn suggest that there are benefits to be gained from loss-aware ensembles when dealing with long-tailed data.
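To make the distinction between the two ensembling schemes concrete, the following is a minimal sketch, not the authors' implementation. It assumes each ensemble member produces a vector of class logits for a single example; the function names `logit_ensemble` and `probability_ensemble` and the toy logits are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logit_ensemble(member_logits):
    """Average the members' logits per class, then apply softmax once."""
    n = len(member_logits)
    mean_logits = [sum(col) / n for col in zip(*member_logits)]
    return softmax(mean_logits)


def probability_ensemble(member_logits):
    """Apply softmax to each member, then average the probabilities per class."""
    n = len(member_logits)
    member_probs = [softmax(z) for z in member_logits]
    return [sum(col) / n for col in zip(*member_probs)]


# Toy logits (illustrative values only): three members, three classes.
member_logits = [
    [4.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
]

p_logit = logit_ensemble(member_logits)
p_prob = probability_ensemble(member_logits)
print("logit ensemble:      ", [round(p, 3) for p in p_logit])
print("probability ensemble:", [round(p, 3) for p in p_prob])
```

Both functions return a valid probability vector, but because softmax is nonlinear the two generally disagree whenever the members disagree; they coincide when all members output identical logits. The imbalance bias reduction losses mentioned in the abstract affect how the members are trained and are not modeled in this sketch.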
