Workshop: Medical Imaging Meets NeurIPS

Comparing Sparse and Deep Neural Networks (NNs): Using AI to Detect Cancer

Charles Strauss


Human pathologists inspect pathology slides containing millions of cells, yet even experts disagree on diagnoses. While deep learning has matched human pathologists on the task of tumor discovery, it is hard to decipher why a given classification decision was reached. Adversarial examples have previously been used to visualize the decision criteria employed by deep learning algorithms, and they often demonstrate that classifications hinge on non-semantic features. Here, we demonstrate that adversarial examples exist for tumor-detector NN models. We compare the robustness to adversarial examples of two types of autoencoders, based either on deep NNs or on sparse coding. Each model consists of an autoencoder whose latent representation is fed into a cell-level classifier. We attack the models with adversarial examples, analyze the attacks, and test how they transfer to the model they were not built for. The latent representations of both types of autoencoders reconstructed pathologist-generated, pixel-level annotations well and thus supported tumor detection at the cell level; both models achieved cell-level classification ROC AUC scores of approximately 0.85 on holdout slides. Small (1%) adversarial perturbations were made to attack either model. Successful attacks on the deep model appeared to be random patterns (i.e., non-semantic), while successful attacks on the sparse model displayed cell-like features (i.e., potentially semantic). The deep model was attacked with the Fast Gradient Sign Method (FGSM), whereas we demonstrate a novel method for attacking the sparse model: run FGSM on a deep classifier that takes the sparse latent representation as its input, then reconstruct an image from the attacked latent representation. Adversarial examples made for one model did not transfer successfully to the other, suggesting that the two classifiers use different criteria for classification.
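The two attacks described above can be sketched as follows. This is a minimal illustration in PyTorch, not the authors' implementation: `model`, `encoder`, `decoder`, and `classifier` are hypothetical stand-ins (in the paper, the encoder/decoder would be the sparse-coding inference and reconstruction steps), and `epsilon=0.01` reflects the stated 1% perturbation budget.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.01):
    """Standard FGSM: perturb the input by epsilon in the direction of
    the sign of the loss gradient with respect to the input."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    return (x_adv + epsilon * x_adv.grad.sign()).detach()

def latent_fgsm_attack(encoder, decoder, classifier, x, y, epsilon=0.01):
    """Latent-space attack (as described for the sparse model):
    encode the image, run FGSM against a classifier that consumes the
    latent code, then decode the perturbed code back to pixel space.
    `encoder`/`decoder` are hypothetical stand-ins for sparse-coding
    inference and reconstruction."""
    z = encoder(x).detach()
    z_adv = fgsm_attack(classifier, z, y, epsilon)
    return decoder(z_adv)
```

The latent-space variant sidesteps the non-differentiable sparse-coding inference step: gradients are only needed through the classifier that sits on top of the latent code, and the decoder maps the perturbed code back to an image.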
