Catastrophic Failures of Neural Active Learning on Heteroskedastic Distributions
Savya Khosla · Alex Lamb · Jordan Ash · Cyril Zhang · Kenji Kawaguchi
Event URL: https://openreview.net/forum?id=6Mfsc-BYp2d

Models which can actively seek out the best quality training data hold the promise of more accurate, adaptable, and efficient machine learning. State-of-the-art techniques tend to prefer examples which are the most difficult to classify. While this works well on homogeneous datasets, we find that it can lead to catastrophic failures when performing active learning on multiple distributions which have different degrees of label noise (heteroskedasticity). Most active learning algorithms strongly prefer to draw from the distribution with more noise, even if its examples have no informative structure (such as solid color images). We find that active learning which encourages diversity and model uncertainty in the selected examples can significantly mitigate these failures. We hope these observations are immediately useful to practitioners and can lead to the construction of more realistic and challenging active learning benchmarks.
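The failure mode described above can be illustrated with a small toy experiment (a hypothetical sketch, not the paper's actual setup): a "clean" cluster whose labels are determined by the input, and a "noisy" cluster whose labels are coin flips. A margin-based uncertainty sampler then concentrates its queries almost entirely on the uninformative noisy cluster.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D data: two clean clusters labeled by sign(x),
# plus a noisy cluster near zero with purely random labels.
n = 200
x_clean = np.concatenate([rng.normal(-2, 0.5, n), rng.normal(2, 0.5, n)])
y_clean = (x_clean > 0).astype(float)
x_noisy = rng.normal(0, 0.3, n)
y_noisy = rng.integers(0, 2, n).astype(float)

x = np.concatenate([x_clean, x_noisy])
y = np.concatenate([y_clean, y_noisy])
group = np.concatenate([np.zeros(2 * n), np.ones(n)])  # 1 = noisy cluster

# Fit 1-D logistic regression by gradient descent.
w, b = 0.0, 0.0
for _ in range(2000):
    p = 1 / (1 + np.exp(-(w * x + b)))
    w -= 0.1 * np.mean((p - y) * x)
    b -= 0.1 * np.mean(p - y)

# Margin-style uncertainty: how far each prediction is from a confident 0 or 1.
p = 1 / (1 + np.exp(-(w * x + b)))
uncertainty = 1 - 2 * np.abs(p - 0.5)

# The 50 examples an uncertainty-based active learner would query next.
queried = np.argsort(-uncertainty)[:50]
frac_noisy = group[queried].mean()
print(f"fraction of queries drawn from the noisy cluster: {frac_noisy:.2f}")
```

Because the model is confident on the structured clusters but stuck near 0.5 on the label-noise cluster, nearly all queries land on examples that can never reduce error — the catastrophic behavior the abstract describes.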

Author Information

Savya Khosla (Delhi Technological University (Delhi College of Engineering), Dhirubhai Ambani Institute Of Information and Communication Technology)
Alex Lamb (Universite de Montreal)
Jordan Ash (Microsoft Research)
Cyril Zhang (Microsoft Research NYC)
Kenji Kawaguchi (MIT)