Subclass-Dominant Label Noise: A Counterexample for the Success of Early Stopping
Yingbin Bai · Zhongyi Han · Erkun Yang · Jun Yu · Bo Han · Dadong Wang · Tongliang Liu

Tue Dec 12 03:15 PM -- 05:15 PM (PST) @ Great Hall & Hall B1+B2 #2028
Event URL: https://github.com/tmllab/2023_NeurIPS_SDN

In this paper, we empirically investigate a previously overlooked and widespread type of label noise, subclass-dominant label noise (SDN). Our findings reveal that, during the early stages of training, deep neural networks can rapidly memorize mislabeled examples in SDN. This phenomenon poses challenges in effectively selecting confident examples using conventional early stopping techniques. To address this issue, we delve into the properties of SDN and observe that long-trained representations are superior at capturing the high-level semantics of mislabeled examples, leading to a clustering effect where similar examples are grouped together. Based on this observation, we propose a novel method called NoiseCluster that leverages the geometric structures of long-trained representations to identify and correct SDN. Our experiments demonstrate that NoiseCluster outperforms state-of-the-art baselines on both synthetic and real-world datasets, highlighting the importance of addressing SDN in learning with noisy labels. The code is available at https://github.com/tmllab/2023_NeurIPS_SDN.
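The abstract describes NoiseCluster only at a high level, so the sketch below is not the authors' algorithm; it is a minimal illustration of the general idea stated above: cluster long-trained feature representations, then correct labels by the majority noisy label within each cluster. The function names, the plain k-means clustering, and the majority-vote correction rule are all assumptions for illustration.

```python
import numpy as np

def kmeans(feats, k, iters=50, seed=0):
    """Plain Lloyd's k-means on feature vectors (illustrative, not the paper's clustering)."""
    rng = np.random.default_rng(seed)
    centers = feats[rng.choice(len(feats), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared Euclidean distance from every point to every center.
        dists = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = feats[assign == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return assign

def cluster_relabel(feats, noisy_labels, k):
    """Relabel each example with the majority noisy label of its feature cluster.

    `feats` stands in for representations from a long-trained network, which
    the abstract says group semantically similar examples together.
    """
    assign = kmeans(feats, k)
    new_labels = noisy_labels.copy()
    for j in range(k):
        idx = np.where(assign == j)[0]
        if len(idx):
            vals, counts = np.unique(noisy_labels[idx], return_counts=True)
            new_labels[idx] = vals[counts.argmax()]  # majority vote in cluster
    return new_labels
```

On toy data where representations form well-separated clusters, the majority vote overturns the minority of mislabeled examples in each cluster, which is the clustering effect the abstract attributes to long-trained representations.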

Author Information

Yingbin Bai (The University of Sydney)
Zhongyi Han (Shandong University)
Erkun Yang (Xidian University)
Jun Yu (University of Science and Technology of China)
Bo Han (Hong Kong Baptist University)
Dadong Wang (CSIRO)
Tongliang Liu (The University of Sydney)