Modern AI systems are inevitably trained on data with inaccurate, missing, or otherwise wrong labels for classification, detection, and segmentation tasks. Without accurate ground-truth labels, however, AI algorithms, especially those powered by deep neural networks, tend to perform poorly. This is the well-known memorization effect of deep learning: deep networks learn from clean labels first, gradually adapt to noisy labels, and eventually overfit to completely random noise. This property can cause poor generalization on test sets. Here we present a thorough study of augmenting deterministic models with Monte Carlo Dropout when training under both synthetic and real-world label noise. We investigate classification accuracy, network sparsity, and neuron responsiveness under label noise simulated via a class-conditional transition matrix, and examine the method's effectiveness on a real-world dataset containing human annotation noise.
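The two ingredients named above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the function names (`corrupt_labels`, `mc_dropout_predict`), the symmetric noise matrix, and the toy one-layer model are all assumptions chosen for brevity. Class-conditional noise flips label `i` to `j` with probability `T[i, j]`; Monte Carlo Dropout keeps dropout active at inference and averages the predictions of several stochastic forward passes.

```python
import numpy as np

def corrupt_labels(labels, T, rng):
    """Flip each clean label y to class j with probability T[y, j]
    (class-conditional label noise; rows of T must sum to 1)."""
    return np.array([rng.choice(len(T), p=T[y]) for y in labels])

def mc_dropout_predict(x, W, n_samples=100, p_drop=0.5, rng=None):
    """Average softmax outputs over stochastic forward passes with
    dropout left ON at inference time (Monte Carlo Dropout).
    Toy one-layer linear model; x: (d,), W: (d, k)."""
    rng = rng or np.random.default_rng(0)
    preds = []
    for _ in range(n_samples):
        mask = rng.random(x.shape) >= p_drop          # drop input units
        logits = (x * mask / (1.0 - p_drop)) @ W      # inverted-dropout scaling
        e = np.exp(logits - logits.max())             # stable softmax
        preds.append(e / e.sum())
    return np.mean(preds, axis=0)

# Example: 3-class symmetric noise with overall flip rate 0.2
num_classes, noise_rate = 3, 0.2
T = np.full((num_classes, num_classes), noise_rate / (num_classes - 1))
np.fill_diagonal(T, 1.0 - noise_rate)

rng = np.random.default_rng(0)
clean = rng.integers(0, num_classes, size=10_000)
noisy = corrupt_labels(clean, T, rng)
observed_rate = np.mean(noisy != clean)  # empirically close to 0.2
```

Asymmetric (pair-flip) noise is obtained the same way by putting the off-diagonal mass of each row of `T` on a single confusable class instead of spreading it uniformly.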