Workshop: Medical Imaging Meets NeurIPS

Jonas Teuwen, Marleen de Bruijne, Qi Dou, Ben Glocker, Ipek Oguz, Aasa Feragen, Hervé Lombaert, Ender Konukoglu

2020-12-12T02:30:00-08:00 - 2020-12-12T11:25:00-08:00
Abstract: 'Medical Imaging meets NeurIPS' is a satellite workshop established in 2017. The workshop aims to bring researchers together from the medical image computing and machine learning communities. The objective is to discuss the major challenges in the field and opportunities for joining forces. This year the workshop will feature online oral and poster sessions with an emphasis on audience interactions. In addition, there will be a series of high-profile invited speakers from industry, academia, engineering and medical sciences giving an overview of recent advances, challenges, latest technology and efforts for sharing clinical data.

Medical imaging is facing a major crisis with an ever-increasing complexity and volume of data and immense economic pressure. The interpretation of medical images pushes human abilities to the limit, with the risk that critical patterns of disease go undetected. Machine learning has emerged as a key technology for developing novel tools in computer-aided diagnosis, therapy and intervention. Still, progress is slow compared to other fields of visual recognition, mainly due to the domain complexity and the constraints of clinical applications, which demand highly robust, accurate, and reliable solutions. The workshop aims to raise awareness of the unmet needs in machine learning for successful applications in medical imaging.


Schedule

2020-12-12T02:30:00-08:00 - 2020-12-12T03:00:00-08:00
Keynote by Lena Maier-Hein: Addressing the Data Bottleneck in Biomedical Image Analysis
Lena Maier-Hein
Machine learning has begun to revolutionize almost all areas of health research. Success stories cover a wide variety of application fields ranging from radiology and dermatology to gastroenterology and mental health applications. Strikingly, however, such widely known success stories appear to be lacking in some subfields of healthcare, such as surgery. A main reason for this phenomenon could be the lack of large annotated training data sets. In the past years, we have investigated the hypothesis that this bottleneck can be overcome by simulated data. This talk will highlight some of the successes and challenges we encountered on our journey.
2020-12-12T03:10:00-08:00 - 2020-12-12T03:20:00-08:00
DeepSim: Semantic similarity metrics for learned image registration
Steffen Czolbe
We propose a semantic similarity metric for image registration. Existing metrics such as Euclidean distance or normalized cross-correlation focus on aligning intensity values, which causes difficulties with low intensity contrast or noise. Our semantic approach learns dataset-specific features that drive the optimization of a learning-based registration model. Compared to existing unsupervised and supervised methods across multiple image modalities and applications, we achieve consistently high registration accuracy and faster convergence than the state of the art, and the learned invariance to noise gives smoother transformations on low-quality images.
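The core idea can be illustrated with a short, hypothetical sketch (not the authors' code): a small encoder stands in for the pretrained, dataset-specific feature extractor, and similarity is computed between feature maps rather than raw intensities.

```python
# Sketch of a feature-space ("semantic") similarity for registration.
# The encoder is a toy stand-in for a pretrained, dataset-specific feature extractor.
import torch
import torch.nn as nn

class FeatureEncoder(nn.Module):
    """Tiny stand-in for a pretrained feature extractor."""
    def __init__(self, channels=(16, 32)):
        super().__init__()
        layers, in_ch = [], 1
        for out_ch in channels:
            layers += [nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(), nn.AvgPool2d(2)]
            in_ch = out_ch
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

def deep_similarity(encoder, warped, fixed, eps=1e-6):
    """Cosine similarity between feature maps of the warped and fixed images."""
    f_w, f_f = encoder(warped), encoder(fixed)
    num = (f_w * f_f).sum(dim=1)
    den = f_w.norm(dim=1) * f_f.norm(dim=1) + eps
    return (num / den).mean()  # maximized while training the registration model

# toy usage: gradient of the similarity w.r.t. the warped (moving) image
enc = FeatureEncoder().eval()
fixed = torch.rand(1, 1, 64, 64)
warped = torch.rand(1, 1, 64, 64, requires_grad=True)
loss = -deep_similarity(enc, warped, fixed)
loss.backward()
```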
2020-12-12T03:20:00-08:00 - 2020-12-12T03:30:00-08:00
Representing Ambiguity in Registration Problems with Conditional Invertible Neural Networks
Darya Trofimova
Image registration is the basis for many applications in the fields of medical image computing and computer assisted interventions. One example is the registration of 2D X-ray images with preoperative three-dimensional computed tomography (CT) images in intraoperative surgical guidance systems. Due to the high safety requirements in medical applications, estimating registration uncertainty is of crucial importance in such a scenario. However, previously proposed methods, including classical iterative registration methods and deep learning-based methods, have one characteristic in common: they lack the capacity to represent the fact that a registration problem may be inherently ambiguous, meaning that multiple (substantially different) plausible solutions exist. To tackle this limitation, we explore the application of invertible neural networks (INNs) as the core component of a registration methodology. In the proposed framework, INNs enable going beyond point estimates as network output by representing the possible solutions to a registration problem as a probability distribution that encodes different plausible solutions via multiple modes. In a first feasibility study, we test the approach in a 2D/3D registration setting by registering spinal CT volumes to X-ray images. To this end, we simulate the X-ray images taken by a C-arm with multiple orientations using the principle of digitally reconstructed radiographs (DRRs). Due to the symmetry of the human spine, there are potentially multiple substantially different poses of the C-arm that can lead to similar projections. The hypothesis of this work is that the proposed approach is able to identify multiple solutions in such ambiguous registration problems.
2020-12-12T03:30:00-08:00 - 2020-12-12T05:00:00-08:00
Poster Session 1
2020-12-12T05:00:00-08:00 - 2020-12-12T05:30:00-08:00
Keynote by Nathan Silberman: Real-world Insights from Patient-facing Machine Learning Models
Nathan Silberman
While many machine learning products have a typical pathway for development, those in the medical imaging domain require a unique approach due to the higher bar for safety, efficacy, and the realities of clinical practice. In this talk, Nathan Silberman will discuss insights gained from launching and monitoring medical imaging machine learning products in clinically demanding settings.
2020-12-12T05:40:00-08:00 - 2020-12-12T05:50:00-08:00
Using StyleGAN for Visual Interpretability of Deep Learning Models on Medical Images
Kathryn Schutte
As AI-based medical devices are becoming more common in imaging fields like radiology and histology, interpretability of the underlying predictive models is crucial to expand their use in clinical practice. Existing heatmap-based interpretability methods such as GradCAM only highlight the location of predictive features but do not explain how they contribute to the prediction. In this paper, we propose a new interpretability method that can be used to understand the predictions of any black-box model on images, by showing how the input image would be modified in order to produce different predictions. A StyleGAN is trained on medical images to provide a mapping between latent vectors and images. Our method identifies the optimal direction in the latent space to create a change in the model prediction. By shifting the latent representation of an input image along this direction, we can produce a series of new synthetic images with changed predictions. We validate our approach on histology and radiology images, and demonstrate its ability to provide meaningful explanations that are more informative than GradCAM heatmaps. Our method reveals the patterns learned by the model, which allows clinicians to build trust in the model’s predictions, discover new biomarkers and eventually reveal potential biases.
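As a rough illustration of the latent-direction idea, the toy sketch below optimizes a single direction in latent space so that walking along it changes a black-box model's prediction; the generator and classifier here are minimal stand-ins, not a trained StyleGAN or the authors' pipeline.

```python
# Illustrative sketch of latent-direction search for counterfactual explanations.
# G stands in for a generator mapping latents to images; f for the black-box classifier.
import torch
import torch.nn as nn

latent_dim = 64
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, 28 * 28), nn.Tanh())
f = nn.Sequential(nn.Linear(28 * 28, 1))   # black-box prediction head (stand-in)

direction = torch.randn(latent_dim, requires_grad=True)
opt = torch.optim.Adam([direction], lr=1e-2)
z = torch.randn(32, latent_dim)            # latent codes of (encoded) input images

for _ in range(200):
    d = direction / direction.norm()
    shift = f(G(z + d)) - f(G(z))          # change in prediction along the direction
    loss = -shift.mean()                   # push predictions up; flip the sign to push them down
    opt.zero_grad()
    loss.backward()
    opt.step()

# Walking along the learned direction yields a series of synthetic images
# with progressively changed predictions.
with torch.no_grad():
    series = [G(z[:1] + alpha * direction / direction.norm()) for alpha in (0.0, 0.5, 1.0, 2.0)]
```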
2020-12-12T05:50:00-08:00 - 2020-12-12T06:00:00-08:00
Context-aware Self-supervised Learning for Medical Images Using Graph Neural Network
Li Sun
Although self-supervised learning enables us to bootstrap the training by exploiting unlabeled data, the generic self-supervised methods for natural images do not sufficiently incorporate the context. For medical images, a desirable method should be sensitive enough to detect deviation from normal-appearing tissue of each anatomical region; here, anatomy is the context. We introduce a novel approach with two levels of self-supervised representation learning objectives: one on the regional anatomical level and another on the patient level. We use graph neural networks to incorporate the relationship between different anatomical regions. The structure of the graph is informed by anatomical correspondences between each patient and an anatomical atlas. In addition, the graph representation has the advantage of handling arbitrarily sized images in full resolution. Experiments on large-scale computed tomography (CT) datasets of lung images show that our approach compares favorably to baseline methods that do not account for the context. We use the learned embedding for staging lung tissue abnormalities related to COVID-19.
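A minimal sketch of the graph-based aggregation step, with a stand-in atlas adjacency and random region features; the patch encoder and the two contrastive objectives are omitted, and all names are illustrative.

```python
# Aggregate region-level embeddings over a fixed anatomical graph into a patient embedding.
import torch
import torch.nn as nn

n_regions, feat_dim = 10, 32
A = torch.rand(n_regions, n_regions)          # atlas-derived adjacency (stand-in values)
A = (A + A.t()) / 2 + torch.eye(n_regions)    # make symmetric, add self-loops
d_inv_sqrt = torch.diag(A.sum(1).pow(-0.5))
A_hat = d_inv_sqrt @ A @ d_inv_sqrt           # symmetrically normalized adjacency

class GraphLayer(nn.Module):
    """One propagation step: neighborhood-averaged linear update."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)
    def forward(self, x, a_hat):
        return torch.relu(self.lin(a_hat @ x))

region_feats = torch.rand(n_regions, feat_dim)    # per-region features from a patch encoder
layer1, layer2 = GraphLayer(feat_dim, feat_dim), GraphLayer(feat_dim, feat_dim)
h = layer2(layer1(region_feats, A_hat), A_hat)    # region-level representations
patient_embedding = h.mean(0)                     # pooled vector for the patient-level objective
```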
2020-12-12T06:00:00-08:00 - 2020-12-12T06:45:00-08:00
Break
2020-12-12T06:45:00-08:00 - 2020-12-12T07:15:00-08:00
Keynote by Spyridon Bakas: The Federated Tumor Segmentation (FeTS) Initiative: Towards a paradigm-shift in multi-institutional collaborations
Spyros Bakas
Spyridon Bakas' talk will revolve around his most recent focus on federated learning (FL). He co-authored what appears to be the first study on FL in medicine and has been funded by the Informatics Technology for Cancer Research (ITCR) program of the National Cancer Institute of the National Institutes of Health (NIH) to develop, in collaboration with Intel, the federated tumor segmentation (FeTS - https://www.fets.ai/) platform, which enables the first real-world consortium of 43 international institutions (so far) looking into FL for tumor segmentation, starting with brain tumors.
2020-12-12T07:25:00-08:00 - 2020-12-12T07:35:00-08:00
Deep learning to assist radiologists in breast cancer diagnosis with ultrasound imaging
Yiqiu Shen
Sonography is an important tool in the detection and characterization of breast masses. Though consistently shown to detect additional cancers as a supplemental imaging modality, breast ultrasound has been noted to have a high false-positive rate relative to mammography and magnetic resonance imaging. Here, we propose a deep neural network that can detect benign and malignant lesions in breast ultrasound images. The network achieves an area under the receiver operating characteristic curve (AUROC) of 0.902 (95% CI: 0.892-0.911) on a test set consisting of 103,611 exams (around 2 million images) collected at Anonymized Institution between 2012 and 2019. To confirm its generalizability, we evaluated the network on an independent external test set on which it achieved an AUROC of 0.908 (95% CI: 0.884-0.933). This highlights the potential of AI in improving accuracy, consistency, and efficiency of breast ultrasound diagnostics worldwide.
2020-12-12T07:35:00-08:00 - 2020-12-12T07:45:00-08:00
Privacy-preserving medical image analysis
Alex Ziller
The utilisation of artificial intelligence in medicine and healthcare has led to successful clinical applications in several domains. The conflict between data usage and privacy protection requirements in such systems must be resolved for optimal results as well as ethical and legal compliance. This calls for innovative solutions such as privacy-preserving machine learning (PPML). We present PriMIA (Privacy-preserving Medical Image Analysis), a software framework designed for PPML in medical imaging. In a real-life case study we demonstrate significantly better classification performance of a securely aggregated federated learning model compared to human experts on unseen datasets. Furthermore, we show an inference-as-a-service scenario for end-to-end encrypted diagnosis, where neither the data nor the model are revealed. Lastly, we empirically evaluate the framework's security against a gradient-based model inversion attack and demonstrate that no usable information can be recovered from the model.
2020-12-12T07:45:00-08:00 - 2020-12-12T09:00:00-08:00
Poster Session 2
2020-12-12T09:00:00-08:00 - 2020-12-12T09:30:00-08:00
Keynote by Jerry Prince: New Approaches for Magnetic Resonance Image Harmonization
Jerry L Prince
Magnetic resonance (MR) images have exquisite soft tissue contrast and are critical to modern clinical imaging and medical science research. Automatic processing of MR images has always been hampered, however, by the lack of standardized tissue contrasts with standardized intensity scales. MR image harmonization or intensity normalization has long been investigated and used as part of neuroimaging pipelines to try to make quantitative measures compatible between MR scanners and across sites, but this has been a difficult task. In this talk, I will give an overview of past work and then describe three new harmonization approaches, each facilitated by a different style of deep network, that have recently been developed in my lab. The first approach is based on image synthesis, the second on domain adaptivity, and the third on a disentangled latent space. I will present brief overviews and results for each method and then discuss their limitations as well as needs and opportunities for future research on MR image harmonization.
2020-12-12T09:40:00-08:00 - 2020-12-12T09:50:00-08:00
Brain2Word: Improving Brain Decoding Methods and Evaluation
Damian Pascual Ortiz
Brain decoding, understood as the process of mapping brain activities to the stimuli that generated them, has been an active research area in recent years. In the case of language stimuli, recent studies have shown that it is possible to decode fMRI scans into an embedding of the word a subject is reading. However, such word embeddings are designed for natural language processing tasks rather than for brain decoding. Therefore, they limit our ability to recover the precise stimulus. In this work, we propose to directly classify an fMRI scan, mapping it to the corresponding word within a fixed vocabulary. Unlike existing work, we evaluate on scans from previously unseen subjects. We argue that this is a more realistic setup and we present a model that can decode fMRI data from unseen subjects. Our model achieves 5.22% Top-1 and 13.59% Top-5 accuracy on this challenging task, significantly outperforming all the considered competitive baselines.
2020-12-12T09:50:00-08:00 - 2020-12-12T10:00:00-08:00
3D Infant Pose Estimation Using Transfer Learning
Simon Ellershaw
This paper presents the first deep learning-based 3D infant pose estimation model. We transfer-learn models first trained in the adult domain. The model outperforms the current 2D and 3D state-of-the-art on the synthetic infant MINI-RGBD test dataset, achieving an average joint position error (AJPE) of 8.17 pixels and 28.47 mm respectively. Furthermore, unlike the current 3D state-of-the-art, the model presented here does not require a depth channel as input. This is an important step in the development of an automated general movement assessment tool for infants, which has the potential to support the diagnosis of a range of neurological disorders, including cerebral palsy.
2020-12-12T10:00:00-08:00 - 2020-12-12T10:10:00-08:00
FastMRI Introduction
Matthew J Muckley
Shortening the scan time for acquiring an MR image is a major outstanding problem for the MRI community. To engage the community towards this objective, we hosted the second fastMRI competition for reconstructing MR images from subsampled k-space data. The data set for the 2020 competition focused on brain images and included 7,299 anonymized, fully-sampled brain scans, with 894 of these held back for challenge evaluation purposes. Our challenge included a qualitative evaluation component where radiologists assessed submissions for “quality of depiction of pathology.” Our challenge also introduced a Transfer track, where participants were asked to run their models on scans from MRI manufacturers not represented in the data set. Results showed one team scoring best in both SSIM scores and qualitative radiologist evaluations, establishing a new state of the art for MRI acceleration.
2020-12-12T10:10:00-08:00 - 2020-12-12T10:15:00-08:00
Q&A FastMRI introduction
2020-12-12T10:15:00-08:00 - 2020-12-12T10:25:00-08:00
FastMRI Talk 1
Mahmoud Mostapha
2020-12-12T10:25:00-08:00 - 2020-12-12T10:35:00-08:00
FastMRI Talk 2
Zaccharie Ramzi
2020-12-12T10:35:00-08:00 - 2020-12-12T10:45:00-08:00
FastMRI Talk 3
Sunwoo Kim
2020-12-12T10:45:00-08:00 - 2020-12-12T10:50:00-08:00
Q&A FastMRI talk 1-3
Combined Q&A for FastMRI talks 1-3.
2020-12-12T10:50:00-08:00 - 2020-12-12T11:20:00-08:00
FastMRI keynote
Yvonne Lui
Clinical Validation of Machine Learning Algorithm Generated Images
Fred Kwon
Generative machine learning (ML) methods can reduce the time, cost, and radiation associated with medical image acquisition, compression, or generation techniques. While quantitative metrics are commonly used in the evaluation of ML-generated images, it is unknown how well these quantitative metrics relate to the diagnostic utility of images. Here, fellowship-trained radiologists provided diagnoses and qualitative evaluations on chest radiographs reconstructed with the current standard JPEG2000 or variational autoencoder (VAE) techniques. Cohen’s kappa coefficient measured the agreement of diagnoses based on different reconstructions. Methods that produced similar Fréchet inception distance (FID) showed similar diagnostic performance. Thus, in place of time-intensive expert radiologist verification, an appropriate target FID (an objective quantitative metric) can evaluate the clinical utility of ML-generated medical images.
Joint Hierarchical Bayesian Learning of Full-structure Noise for Brain Source Imaging
Ali Hashemi
Many problems in human brain imaging involve hierarchical Bayesian (type-II maximum likelihood) regression models for observations with latent variables for source and noise, where the parameters of the priors for the source and noise terms need to be estimated jointly from data. One example is the biomagnetic inverse problem, where crucial factors influencing the accuracy of brain source estimation are not only the noise level but also its correlation structure. Importantly, existing approaches have not addressed the estimation of a full-structure noise covariance matrix. Using ideas from Riemannian geometry, we derive an efficient algorithm for updating both the source and a full-structure noise covariance along the manifold of positive definite matrices. Our results demonstrate that the novel framework significantly improves upon state-of-the-art techniques in the real-world scenario with fully-structured noise covariance.
Quantification of task similarity for efficient knowledge transfer in biomedical image analysis
Patrick Scholz
Shortage of annotated data is one of the greatest bottlenecks related to deep learning in healthcare. Methods proposed to address this issue include transfer learning, crowdsourcing and self-supervised learning. More recently, first attempts to leverage the concept of meta learning have been made. Meta learning studies how learning systems can increase in efficiency through experience, where experience can be represented, for example, by solutions to tasks connected to previously acquired data. A core capability of meta learning-based approaches is the identification of similar previous tasks given a new task. Quantifying the similarity between tasks, however, is an open research problem. We address this challenge by investigating two complementary approaches: (1) leveraging images and labels to embed a complete data set in a vector of fixed length that serves as a task fingerprint; and (2) directly comparing the distributions of the images with sample-based and optimal transport-based methods, thereby neglecting the labels.
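A hedged sketch of the two directions, using synthetic per-image feature vectors: a fixed-length fingerprint built from pooled statistics, and a simple sliced-Wasserstein approximation as a sample-based distribution distance. The paper's exact embeddings and optimal-transport solvers are not reproduced here.

```python
# Two illustrative ways to compare tasks: fingerprint vectors and a distribution distance.
import numpy as np

def fingerprint(features):
    """Fixed-length task fingerprint: concatenated mean and std of per-image features."""
    return np.concatenate([features.mean(0), features.std(0)])

def sliced_wasserstein(x, y, n_proj=100, rng=None):
    """Average 1D Wasserstein distance over random projections (assumes equal sample sizes)."""
    rng = rng or np.random.default_rng(0)
    dirs = rng.normal(size=(n_proj, x.shape[1]))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    px, py = x @ dirs.T, y @ dirs.T
    return np.mean(np.abs(np.sort(px, axis=0) - np.sort(py, axis=0)))

# toy usage with two "datasets" of per-image feature vectors
task_a = np.random.default_rng(1).normal(size=(500, 64))
task_b = np.random.default_rng(2).normal(loc=0.5, size=(500, 64))
print(np.linalg.norm(fingerprint(task_a) - fingerprint(task_b)))
print(sliced_wasserstein(task_a, task_b))
```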
A Bayesian Unsupervised Deep-Learning Based Approach for Deformable Image Registration
Samah Khawaled
Unsupervised deep-learning (DL) models were recently proposed for deformable image registration tasks. In such models, a neural network is trained to predict the best deformation field by minimizing a dissimilarity function between the moving and the target images. We introduce a fully Bayesian framework for unsupervised DL-based deformable image registration. Our method provides a principled way to characterize the true posterior distribution, thus avoiding potential over-fitting. We demonstrate the added value of our Bayesian unsupervised DL-based registration framework on the MNIST and brain MRI (MGH10) datasets in comparison to VoxelMorph. Our experiments show that our approach provides better estimates of the deformation field in terms of improved mean squared error (0.0063 vs. 0.0065) and Dice coefficient (0.73 vs. 0.71) for the MNIST and the MGH10 datasets, respectively. Further, it provides an estimate of the uncertainty in the deformation field.
Embracing the Disharmony in Heterogeneous Medical Data
Rongguang Wang
Heterogeneity in medical imaging data is often tackled, in the context of machine learning, using domain invariance, i.e. deriving models that are robust to domain shifts, which can be both within domain (e.g. demographics) and across domains (e.g. scanner/protocol characteristics). However, this approach can be detrimental to performance because it necessitates averaging across intra-class variability and reduces the discriminatory power of learned models in order to achieve better intra- and inter-domain generalization. This paper instead embraces the heterogeneity and treats it as a multi-task learning problem to explicitly adapt trained classifiers to both inter-site and intra-site heterogeneity. We demonstrate that the error of a base classifier on challenging 3D brain magnetic resonance imaging (MRI) datasets can be reduced by 2-3x, in certain tasks, by adapting to the specific demographics of the patients and different acquisition protocols. Learning the characteristics of domain shifts is achieved via auxiliary learning tasks leveraging commonly available data and variables, e.g. demographics. In our experiments, we use gender classification and age regression as auxiliary tasks that help the network weights trained on a source site adapt to data from a target site; we show that this approach improves classification error by 5-30% across different datasets on the main classification tasks, e.g. disease classification.
Hierarchical Amortized Training for Memory-efficient High Resolution 3D GAN
Li Sun
Generative Adversarial Networks (GAN) have many potential medical imaging applications, including data augmentation, domain adaptation, and model explanation. Due to the limited embedded memory of Graphical Processing Units (GPUs), most current 3D GAN models are trained on low-resolution medical images. In this work, we propose a novel end-to-end GAN architecture that can generate high-resolution 3D images. We achieve this goal by separating training and inference. During training, we adopt a hierarchical structure that simultaneously generates a low-resolution version of the image and a randomly selected sub-volume of the high-resolution image. The hierarchical design has two advantages: First, the memory demand for training on high-resolution images is amortized among subvolumes. Furthermore, anchoring the high-resolution subvolumes to a single low-resolution image ensures anatomical consistency between subvolumes. During inference, our model can directly generate full high-resolution images. Experiments on 3D thorax CT and brain MRI demonstrate that our approach outperforms baselines in quality of generated images.
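The memory-amortization idea can be sketched as follows, assuming illustrative volume sizes and stand-in generator outputs: per training step, only a randomly chosen high-resolution sub-volume is supervised, anchored to the matching crop of the low-resolution whole volume. The GAN losses and networks themselves are omitted.

```python
# Sketch of sub-volume sampling and the anatomical-consistency anchor between scales.
import torch
import torch.nn.functional as F

full_res, low_res, sub = 128, 32, 32    # illustrative sizes (voxels per side)
scale = full_res // low_res

def random_subvolume_origin(rng=torch.Generator().manual_seed(0)):
    # origins aligned to the low-res grid so the crop maps cleanly between scales
    return [int(torch.randint(0, (full_res - sub) // scale + 1, (1,), generator=rng)) * scale
            for _ in range(3)]

low_volume = torch.rand(1, 1, low_res, low_res, low_res)        # G_low output (stand-in)
ox, oy, oz = random_subvolume_origin()
high_sub = torch.rand(1, 1, sub, sub, sub, requires_grad=True)  # G_high output (stand-in)

# consistency term: the downsampled high-res sub-volume should match the
# corresponding region of the low-res whole volume
low_crop = low_volume[..., ox // scale:(ox + sub) // scale,
                      oy // scale:(oy + sub) // scale,
                      oz // scale:(oz + sub) // scale]
down = F.interpolate(high_sub, size=low_crop.shape[-3:], mode="trilinear", align_corners=False)
consistency_loss = F.l1_loss(down, low_crop)
consistency_loss.backward()
```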
Semantic Video Segmentation for Intracytoplasmic Sperm Injection Procedures
Peter He
We present the first deep learning model for the analysis of intracytoplasmic sperm injection (ICSI) procedures. Using a dataset of ICSI procedure videos, we train a deep neural network to segment key objects in the videos achieving a mean IoU of 0.962, and to localize the needle tip achieving a mean pixel error of 3.793 pixels at 14 FPS on a single GPU. We further analyze the variation between the dataset's human annotators and find the model's performance to be comparable to human experts.
Comparing Sparse and Deep Neural Networks (NNs): Using AI to Detect Cancer
Charles Strauss
Human pathologists inspect pathology slides containing millions of cells, but even experts disagree on diagnosis. While deep learning has shown human-pathologist-level success on the task of tumor discovery, it is hard to decipher why a classification decision was reached. Previously, adversarial examples have been used to visualize the decision criteria employed by deep learning algorithms, and they often demonstrate that classifications hinge on non-semantic features. Here, we demonstrate that adversarial examples exist for tumor-detector NN models. We compare the relative robustness to adversarial examples of two types of autoencoders, based either on deep NNs or on sparse coding. Our models consist of an autoencoder whose latent representation is fed into a cell-level classifier. We attack the models with adversarial examples, analyze the attacks, and test how these attacks transfer to the model they were not built for. We found that the latent representations of both types of autoencoders did well at reconstructing pathologist-generated, pixel-level annotations and thus supported tumor detection at the cell level. Both models supported cell-level classification AUC ROC scores of approximately 0.85 on holdout slides. Small (1%) adversarial perturbations were made to attack either model. Successful attacks on the deep model appeared to be random patterns (i.e. non-semantic), while successful attacks on the sparse model displayed cell-like features (i.e. potentially semantic). The deep model was attacked with the Fast Gradient Sign Method (FGSM), whereas we demonstrate a novel method for attacking the sparse model: running FGSM on a deep classifier that uses the sparse latent representation as its input and reconstructing an image from the attacked sparse latent representation. Adversarial examples made for one model did not successfully transfer to the opposite model, suggesting that the two classifiers use different criteria for classification.
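For reference, a minimal FGSM sketch of the kind of attack described above; the cell-level classifier is a toy stand-in, not one of the models in the study.

```python
# Fast Gradient Sign Method: perturb the input along the sign of the loss gradient.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 2))   # cell-level classifier (stand-in)
x = torch.rand(1, 1, 32, 32)                                  # image patch
y = torch.tensor([1])                                         # "tumor" label

def fgsm(model, x, y, eps=0.01):
    x_adv = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    with torch.no_grad():
        return (x_adv + eps * x_adv.grad.sign()).clamp(0, 1)

x_adv = fgsm(model, x, y, eps=0.01)   # roughly a 1% perturbation, as in the study above
```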
Ultrasound Diagnosis of COVID-19: Robustness and Explainability
Jay Roberts
Diagnosis of COVID-19 at the point of care is vital to the containment of the global pandemic. Point-of-care ultrasound (POCUS) provides rapid imagery of the lungs to detect COVID-19 in patients in a repeatable and cost-effective way. Previous work has focused on using a public dataset of POCUS videos to train an AI model for diagnosis that obtains high sensitivity. Due to the high-stakes application, we propose the use of robust and explainable techniques. We demonstrate experimentally that robust models have more stable predictions and offer improved interpretability. A framework of contrastive explanations based on adversarial perturbations is used to explain model predictions in a way that aligns with human visual perception.
Decoding Brain States: Clustering fMRI Dynamic Functional Connectivity Timeseries with Deep Autoencoders
Arthur Spencer
In dynamic functional connectivity analysis, brain states can be derived by identifying repetitively occurring functional connectivity patterns. This presents a high-dimensional, unsupervised learning task, often approached with k-means clustering. To advance this, we use deep autoencoders for dimensionality reduction before applying k-means to the embedded space. We provide quantitative validation on synthetic data and demonstrate better performance than currently used approaches. We go on to demonstrate the utility of this method by applying it to real data from human subjects.
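A compact sketch of this pipeline, assuming vectorized sliding-window connectivity matrices as input; the autoencoder is deliberately tiny and all sizes are illustrative.

```python
# Compress dynamic FC windows with an autoencoder, then cluster in the embedded space.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

n_windows, n_edges, emb_dim = 1000, 4950, 16   # e.g. upper triangle of a 100x100 FC matrix

class AE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_edges, 256), nn.ReLU(), nn.Linear(256, emb_dim))
        self.dec = nn.Sequential(nn.Linear(emb_dim, 256), nn.ReLU(), nn.Linear(256, n_edges))
    def forward(self, x):
        z = self.enc(x)
        return self.dec(z), z

fc_windows = torch.rand(n_windows, n_edges)    # vectorized sliding-window FC matrices (stand-in)
model = AE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(5):                             # a few illustrative training steps
    recon, _ = model(fc_windows)
    loss = nn.functional.mse_loss(recon, fc_windows)
    opt.zero_grad()
    loss.backward()
    opt.step()

with torch.no_grad():
    _, embeddings = model(fc_windows)
states = KMeans(n_clusters=5, n_init=10).fit_predict(embeddings.numpy())   # brain-state labels
```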
Encoding Clinical Priori in 3D Convolutional Neural Networks for Prostate Cancer Detection in bpMRI
Anindo Saha
We hypothesize that anatomical priors can be viable mediums to infuse domain-specific clinical knowledge into state-of-the-art convolutional neural networks (CNN) based on the U-Net architecture. We introduce a probabilistic population prior which captures the spatial prevalence and zonal distinction of clinically significant prostate cancer (csPCa), in order to improve its computer-aided detection (CAD) in bi-parametric MR imaging (bpMRI). To evaluate performance, we train 3D adaptations of the U-Net, U-SEResNet, UNet++ and Attention U-Net using 800 institutional training-validation scans, paired with radiologically-estimated annotations and our computed prior. For 200 independent testing bpMRI scans with histologically-confirmed delineations of csPCa, our proposed method of encoding clinical priori demonstrates a strong ability to improve patient-based diagnosis (up to 8.70% increase in AUROC) and lesion-level detection (average increase of 1.08 pAUC between 0.1–1.0 false positives per patient) across all four architectures.
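One simple way to encode such a spatial prior, shown here as an assumption-laden sketch rather than the authors' implementation, is to concatenate the registered prevalence map as an extra input channel of the segmentation network.

```python
# Feed a probabilistic population prior to a 3D segmentation network as an extra channel.
import torch
import torch.nn as nn

bpmri = torch.rand(1, 3, 24, 128, 128)      # e.g. T2W, ADC, DWI channels (stand-in data)
prior = torch.rand(1, 1, 24, 128, 128)      # population prevalence map, registered to the scan

x = torch.cat([bpmri, prior], dim=1)        # the network now sees anatomy-aware context
unet3d = nn.Conv3d(4, 1, kernel_size=3, padding=1)   # stand-in for any of the 3D U-Net variants
logits = unet3d(x)
```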
LVHNet: Detecting Cardiac Structural Abnormalities with Chest X-Rays
Shreyas Bhave
Early identification of changes to heart size is critical to improving outcomes in heart failure. We introduce a deep learning model for detecting cardiac structural abnormality in chest X-rays. State of the art deep learning models focus on detecting cardiomegaly, a label consistently shown to be a poor marker for cardiac disease. Our method targets four major cardiac structural abnormalities -- left ventricular hypertrophy (LVH), severe LVH, dilated cardiomyopathy phenotype, and hypertrophic cardiomyopathy phenotype -- with performance superior to radiologist assessments. Furthermore, upon interrogation, we find our model's predictions are driven most strongly by structural features of the heart, confirming our model correctly focuses on the elements of chest X-rays pertinent to the diagnoses of cardiac structural abnormality.
Can We Learn to Explain Chest X-Rays?: A Cardiomegaly Use Case
Neil Jethani
In order to capitalize on the numerous applications of machine learning for medical imaging analysis, clinicians need to understand the clinical decisions made by machine learning (ML) models. This allows clinicians to trust ML models, understand their failure modes, and ideally learn from their superhuman capabilities and expand clinical knowledge. Providing explanations for each high resolution image in a large medical database can be computationally expensive. Recent methods amortize this cost by learning a selector model that takes a sample of data and selects the subset of its features that is important. We show that while the selector model learned by these methods make it simple for practitioners to explain new images, the model learns to counterintuitively encode predictions within its selections, omitting the important features. We demonstrate that this phenomenon can occur even with simple medical imaging tasks, such as detecting cardiomegaly in chest X-Rays. We propose REAL-X to address these issues and show that our method provides trustworthy explanations through quantitative and expert radiologist evaluation.
StND: Streamline-based Non-rigid partial-Deformation Tractography Registration
Bramsh Q Chandio
A brain pathway is digitally represented as a 3D line connecting an ordered sequence of 3D vector points called a streamline. Streamlines are generated by tractography methods applied to diffusion-weighted MRI. Direct alignment of white matter tractography/tracts is a crucial part of any diffusion MRI tractography-based method such as group analysis, tract segmentation, and tractometry analysis. In the past decade, several linear methods for streamline registration have been developed, but the neuroimaging field still lacks robust methods for nonrigid streamline-based registration. In this paper, we introduce the StND method for streamline-based partial-deformation registration. We formulate a registration problem for nonrigid registration of white matter tracts. In StND, we first perform affine streamline-based linear registration (SLR) on white matter tracts and then add a deformation step using the probabilistic non-rigid registration method called Coherent Point Drift. We model our collection of streamline data as 3D point-set data and apply high-level deformations to better align tracts.
Unsupervised detection of Hypoplastic Left Heart Syndrome in fetal screening
Elisa Chotzoglou
Congenital heart disease is considered one of the most common congenital malformations, affecting 6–11 per 1000 newborns. In this work, an automated framework for the detection of cardiac anomalies during ultrasound screening examinations is proposed and evaluated on the example of Hypoplastic Left Heart Syndrome, a sub-category of congenital heart disease. We propose an unsupervised approach that learns healthy anatomy exclusively from clinically confirmed normal control patients. We evaluate a number of known anomaly detection frameworks together with a model architecture based on the α-GAN network and find evidence that the proposed model achieves a performance of 0.8 AUC with better robustness towards initialisation compared to individual state-of-the-art models.
COVIDNet-S: SARS-CoV-2 lung disease severity grading of chest X-rays using deep convolutional neural networks
Alexander Wong
Assessment of lung disease severity is a crucial step in the clinical workflow for patients with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the cause of the coronavirus disease 2019 (COVID-19) pandemic. A routine procedure for performing such an assessment involves analyzing chest x-rays (CXRs), with two key metrics being the extent of lung involvement and the degree of opacity. In this study, we introduce COVIDNet-S, a pair of deep convolutional neural networks based on the COVID-Net architecture for performing automatic geographic extent grading and opacity extent grading. We further introduce COVIDx-S, a benchmark dataset consisting of 396 CXRs from SARS-CoV-2 positive patient cases around the world, graded by two board-certified expert chest radiologists (with 20+ years of experience) and a 2nd-year radiology resident. To the best of our knowledge, this is the largest study as well as dataset of its kind for SARS-CoV-2 severity grading. Furthermore, this is the first study of its kind to make both models and dataset open access for the research community. Experimental results using 100-trial stratified Monte Carlo cross-validation (split between geographic and opacity extent) showed that the COVIDNet-S networks achieved R^2 of 0.664 +/- 0.001 and 0.635 +/- 0.002 between predicted scores and radiologist scores for geographic extent and opacity extent, respectively, with the best performing COVIDNet-S networks achieving R^2 of 0.739 and 0.741 for geographic extent and opacity extent, respectively. These promising results illustrate the potential of leveraging deep convolutional neural networks for computer-aided assessment of SARS-CoV-2 lung disease severity.
AI system for predicting the deterioration of COVID-19 patients in the emergency department
Farah Shamout
During the COVID-19 pandemic, rapid and accurate triage of patients at the emergency department is critical to inform decision-making. We propose a data-driven approach for automatic prediction of deterioration risk using a deep neural network that learns from chest X-ray images, and a gradient boosting model that learns from routine clinical variables. Our AI prognosis system, trained using data from 3,661 patients, achieves an area under the receiver operating characteristic curve (AUC) of 0.786 (95% CI: 0.742-0.827) when predicting deterioration within 96 hours. The deep neural network extracts informative areas of chest X-ray images to assist clinicians in interpreting the predictions, and performs comparably to two radiologists in a reader study. To verify performance in a real clinical setting, we silently deployed a preliminary version of the deep neural network at Anonymous Institution during the first wave of the pandemic, which produced accurate predictions in real time. In summary, our findings demonstrate the potential of the proposed system for assisting front-line physicians in the triage of COVID-19 patients.
Annotation-Efficient Deep Semi-Supervised Learning for Automatic Knee Osteoarthritis Severity Diagnosis from Plain Radiographs
Hoang Nguyen
Osteoarthritis (OA) is a worldwide disease that occurs in joints, causing irreversible damage to cartilage and other joint tissues. The knee is particularly vulnerable to OA, and millions of people, regardless of gender, geographical location, and race, suffer from knee OA. When the disease reaches the late stages, patients have to undergo total knee replacement (TKR) surgery to avoid disability. For society, the direct and indirect costs of OA are high; for instance, OA is one of the five most expensive healthcare expenditures in Europe. In the United States, the burden of knee OA is also high, and TKR surgeries annually cost over 10 billion dollars. If knee OA could be detected at an early stage, its progression might be slowed down, thereby yielding significant benefits at personal and societal levels. Radiographs, low-cost and widely available in primary care, are sufficiently informative for knee OA severity diagnosis. However, the process of visual assessment of radiographs is rather tedious, and as a result, various Deep Learning (DL) based methods for automatic diagnosis of knee OA severity have recently been developed. The primary drawback of these methods is their dependency on large amounts of annotations, which are expensive in terms of cost and time to collect. In this paper, we introduce Semixup, a novel Semi-Supervised Learning (SSL) method, which we apply to the automatic diagnosis of knee OA severity in an annotation-efficient manner.
A Deep Learning Model to Detect Anemia from Echocardiography
Weston Hughes
Computer vision models applied in medical imaging domains are capable of diagnosing diseases beyond what human physicians are capable of unassisted. This is especially the case in cardiology, where echocardiograms, electrocardiograms, and other imaging methods have been shown to contain large amounts of information beyond that described by simple clinical observation. Using 67,762 echocardiograms and temporally associated laboratory hemoglobin test results, we trained a video-based deep learning algorithm to predict abnormal lab values. On held-out test data, the model achieved an area under the curve (AUC) of 0.80 in predicting abnormal hemoglobin. We applied smoothgrad to further understand the features used by the model, and compared its performance with a linear model based on demographics and features derived from the echocardiogram. These results suggest that advanced algorithms can obtain additional value from diagnostic imaging and identify phenotypic information beyond the ability of expert clinicians.
Hip Fracture Risk Modeling Using DXA and Deep Learning
Peter Sadowski
The risk of hip fracture is predicted from dual-energy X-ray absorptiometry (DXA) images using deep learning and over 10,000 exams from the HealthABC longitudinal study. The approach is evaluated in four different clinical scenarios of increasing diagnostic intensity. In the scenario with the most information available, deep learning achieves an area under the ROC curve (AUC) of 0.75 on a held-out test set, while a standard linear model that relies on feature-engineering achieves an AUC of 0.72.
Classification with a domain shift in medical imaging
Alessandro Fontanella
Labelled medical imaging datasets are often small in size, but other unlabelled datasets with a domain shift may be available. In this work, we propose a method that is able to exploit these additional unlabelled data, possibly with a domain shift, to improve predictions on our labelled data. To this aim, we learn features in a self-supervised way while projecting all the data onto the same space to achieve better transfer. We first test our approach on natural images and verify its effectiveness on Office-31 data. Then, we apply it to retinal fundus datasets and through a series of experiments on age-related macular degeneration (AMD) and diabetic retinopathy (DR) grading, we show how our method improves the baseline of pre-training on ImageNet and fine-tuning on the labelled data in terms of classification accuracy, AUC and clinical interpretability.
3D UNet with GAN discriminator for robust localisation of the fetal brain and trunk in MRI with partial coverage of the fetal body
Alena Uus
In fetal MRI, automated localisation of the fetal brain or trunk is a prerequisite for motion correction methods. However, the existing CNN-based solutions are prone to errors and may require manual editing. In this work, we propose to combine a multi-label 3D UNet with a GAN discriminator for localisation of both fetal brain and trunk in fetal MRI stacks. The proposed method is robust for datasets with both full and partial coverage of the fetal body.
Biomechanical modelling of brain atrophy through deep learning
Mariana da Silva
We present a proof-of-concept, deep learning (DL) based, differentiable biomechanical model of realistic brain deformations. Using prescribed maps of local atrophy and growth as input, the network learns to deform images according to a Neo-Hookean model of tissue deformation. The tool is validated using longitudinal brain atrophy data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, and we demonstrate that the trained model is capable of rapidly simulating new brain deformations with minimal residuals. This method has the potential to be used in data augmentation or for the exploration of different causal hypotheses reflecting brain growth and atrophy.
Deep Learning extracts novel MRI biomarkers for Alzheimer’s disease progression
Yi Li
Case-control genome-wide association studies (GWAS) for late-onset Alzheimer's disease (AD) may miss genetic variants relevant for delineating disease stages when clinically defined cases and controls are used as the phenotype, since the cases highlight advanced AD, and widely heterogeneous mild cognitive impairment patients are usually excluded. More precise phenotypes for AD are in demand. Here we use a transfer learning technique to train three-dimensional convolutional neural network (CNN) models based on structural Magnetic Resonance Images (MRI) from the screening stage in the ADNI consortium to derive image features that reflect AD progression. CNN-derived image phenotypes are significantly associated with genetic variants mapped to candidate genes enriched for amyloid beta degradation, tau phosphorylation, calcium ion binding-dependent synaptic loss, APP-regulated inflammation response, and insulin resistance. This is the first attempt to show that non-invasive MRI biomarkers are linked to AD progression characteristics, reinforcing their utility in early AD diagnosis and progression monitoring.
Towards disease-aware image editing of chest X-rays
Aakash Saboo
Disease-aware image editing by means of generative adversarial networks (GANs) constitutes a promising avenue for advancing the use of AI in the healthcare sector, and we present a proof of concept of this idea. While GAN-based techniques have been successful in generating and manipulating natural images, their application to the medical domain is still in its infancy. Working with the CheXpert data set, we show that StyleGAN can be trained to generate realistic chest X-rays. Inspired by the Cyclic Reverse Generator (CRG) framework, we train an encoder that allows for faithfully inverting the generator on synthetic X-rays and provides organ-level reconstructions of real ones. Employing a guided manipulation of latent codes, we confer the medical condition of cardiomegaly (increased heart size) onto real X-rays from healthy patients.
Learning MRI contrast agnostic registration
Malte Hoffmann, Adrian Dalca
We introduce a strategy for learning image registration without imaging data, producing powerful networks agnostic to magnetic resonance imaging (MRI) contrast. While classical methods accurately estimate the spatial correspondence between images, they solve an optimization problem for every new image pair. Learning methods are fast at test time but limited to images with contrasts and geometric content seen during training. We propose to remove this dependency using a generative strategy that exposes networks to a wide range of synthetic images during training, forcing them to generalize. We show that networks trained within this framework generalize to a broad array of unseen MRI contrasts and surpass state-of-the-art brain registration accuracy for any contrast combination tested. Critically, training on shapes synthesized from noise distributions results in competitive performance, removing the dependency on acquired data of any kind. However, if available, synthesizing images from anatomical labels can further boost accuracy.
A Critic Evaluation Of Covid-19 Automatic Detection From X-Ray Images
Gianluca Maguolo
In this paper, we compare and evaluate different testing protocols used for automatic COVID-19 diagnosis from X-ray images in the recent literature. We show that similar results can be obtained using X-ray images that do not contain most of the lungs. We remove the lungs from the images by blacking out the center of the X-ray scan and training our classifiers only on the outer part of the images. Hence, we deduce that several testing protocols for the recognition are not fair and that the neural networks are learning patterns in the dataset that are not correlated with the presence of COVID-19. Finally, we show that creating a fair testing protocol is a challenging task, and we provide a method to measure how fair a specific testing protocol is. For future research, we suggest checking the fairness of a testing protocol using our tools, and we encourage researchers to look for better techniques than the ones we propose.
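The masking experiment can be mimicked with a few lines (a hypothetical sketch, not the authors' code): black out the central region of each image, where most of the lungs lie, and train or evaluate only on what remains.

```python
# Shortcut-learning sanity check: mask the center of each X-ray before classification.
import numpy as np

def mask_center(image, fraction=0.6):
    """Set a central box covering `fraction` of each side to zero."""
    img = image.copy()
    h, w = img.shape[:2]
    dh, dw = int(h * fraction / 2), int(w * fraction / 2)
    img[h // 2 - dh:h // 2 + dh, w // 2 - dw:w // 2 + dw] = 0
    return img

xray = np.random.rand(224, 224)
outer_only = mask_center(xray)   # if a classifier still scores well on such inputs,
                                 # it is likely exploiting dataset shortcuts, not pathology
```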
RATCHET: Medical Transformer for Chest X-ray Diagnosis and Reporting
Benjamin Hou
Chest X-rays are one of the most common forms of radiological examination. They are relatively inexpensive and quick to perform. However, the ability to interpret a radiograph may take years of training for a highly skilled practitioner. Automatic generation of radiology reports, therefore, can be an attractive way to support clinical pathways and patient care. In this work, we present RATCHET: RAdiological Text Captioning for Human Examined Thoraxes. RATCHET is trained on free-text radiology reports from the MIMIC-CXR dataset and is demonstrated to be highly linguistically fluent whilst being clinically accurate.
Autoencoder Image Compression Algorithm for Reduction of Resource Requirements
Fred Kwon
Exponentially increasing amounts of compute resources are used in state-of-the-art machine learning (ML) models. We designed a lightweight medical imaging compression machine learning algorithm with preserved diagnostic utility. Our compression algorithm was a two-level, vector-quantized variational autoencoder (VQ-VAE-2). We trained our algorithm in a self-supervised manner with CheXpert radiographs and externally validated it with previously unseen MIMIC-CXR radiographs. We also used the compressed latent vectors or the reconstructed CheXpert images as inputs to train a DenseNet-121 classifier. The VQ-VAE achieved 2.5 times the compression ratio of the current JPEG2000 standard at a similar Fréchet inception distance. The classifier trained on latent vectors has a similar AUROC to that of the model trained on original images. Model training with latent vectors required 6.2% of the memory and compute and 48.5% of the time per epoch compared to training with original images. Autoencoders can decrease resource requirements for future ML research.
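A sketch of the downstream-classification step, assuming the quantized latent grids have already been computed by an encoder; the small CNN below is a stand-in for DenseNet-121, and the shapes are illustrative.

```python
# Train a classifier directly on compressed latents instead of full-resolution radiographs.
import torch
import torch.nn as nn

latents = torch.rand(8, 64, 28, 28)        # batch of quantized latent grids (stand-in)
labels = torch.randint(0, 2, (8,))

classifier = nn.Sequential(
    nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(32, 2),
)
loss = nn.functional.cross_entropy(classifier(latents), labels)
loss.backward()   # training on latents needs far less memory than training on full images
```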
Learning to estimate a surrogate respiratory signal from cardiac motion by signal-to-signal translation
Akshay Iyer
In this work, we develop a neural network-based method to convert a noisy motion signal generated from segmenting cardiac SPECT images, to that of a high-quality surrogate signal, such as those seen from external motion tracking systems (EMTs). This synthetic surrogate will be used as input to our pre-existing motion correction technique developed for EMT surrogate signals. In our method, we test two families of neural networks to perform signal-to-signal translation (noisy internal motion to external surrogate): 1) fully connected networks and 2) convolutional neural networks. Our dataset consists of cardiac perfusion SPECT acquisitions for which cardiac motion was estimated (input: COM signals) in conjunction with a respiratory surrogate motion signal acquired using a commercial Vicon Motion Tracking System (GT: EMT signals). We obtain an r-score of 0.74 between the predicted surrogate and the EMT signal and our goal is to lay a foundation to guide the optimization of neural networks for respiratory motion correction from SPECT without the need for an EMT.
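A minimal sketch of such a signal-to-signal model, with illustrative shapes and a plain 1D convolutional network standing in for the architectures compared in the study.

```python
# Map a noisy internal motion trace to a clean surrogate trace with a small 1D CNN.
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=9, padding=4), nn.ReLU(),
    nn.Conv1d(16, 16, kernel_size=9, padding=4), nn.ReLU(),
    nn.Conv1d(16, 1, kernel_size=9, padding=4),
)

noisy_com = torch.rand(4, 1, 256)          # center-of-mass motion traces from SPECT frames
target_emt = torch.rand(4, 1, 256)         # external-tracker surrogate traces
loss = nn.functional.mse_loss(net(noisy_com), target_emt)
loss.backward()
```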
Multi-Label Incremental Few-Shot Learning for Medical Image Pathology classifiers
Laleh Seyyed-Kalantari
Deep learning models for medical image classification are typically trained to predict pre-determined radiological findings and cannot incorporate a novel disease at test time efficiently. Retraining an entirely new classifier is often out of question due to insufficient novel data or lack of compute/base disease data. Thus, learning fast adapting models is essential. Few-shot learning has shown promise for efficient adaptation to new classes at test time, but literature revolves primarily around single-label natural images distributed over a large number of different classes. However, this setting differs notably from the medical imaging domain, where images are multilabel, of fewer total categories, and retention of base label predictions is desired. In this paper, we study incremental few-shot learning for low- and multilabel medical image data to address the problem of learning a novel disease with few finetuning samples while retaining knowledge over base findings. We show strong performance on incrementally learned novel disease labels for chest X-rays with strong performance retention on base classes.
Diffusion MRI-based structural connectivity robustly predicts "brain-age"
Guruprasath Gurusamy
Neuroimaging-based biomarkers of brain health are necessary for early diagnosis of cognitive decline in the aging population. While many recent studies have investigated whether an individual's "brain-age" can be accurately predicted based on anatomical or functional brain biomarkers, comparatively few studies have sought to predict brain-age with structural connectivity features alone. Here, we investigated this question with data from a large cross-sectional study of elderly volunteers in India (n=158 participants, age-range=51-86 yrs, 66 females). We analyzed 23 standardized cognitive test scores obtained from these participants with factor analysis. All test score variations could be explained with just three latent cognitive factors, each of which declined markedly with age. Next, using diffusion magnetic resonance imaging (dMRI) and tractography we estimated the structural brain connectome in a subset of n=101 individuals. Structural connectivity features robustly predicted inter-individual variations in cognitive factor scores (r=0.293-0.407, p<0.001) and chronological age (r=0.517-0.535, p<0.001), and identified critical connections in the prefrontal and parietal cortex whose strength most strongly predicted each of these variables. dMRI structural connectivity may serve as a reliable tool for predicting age-related cognitive decline in healthy individuals, as well as accelerated decline in patient populations.
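The predictive analysis can be sketched, under assumptions about the feature layout, as a cross-validated linear regression from vectorized connectomes to age; the data below are synthetic and the model is a generic ridge regression rather than the study's exact estimator.

```python
# Cross-validated prediction of age from vectorized structural-connectivity features.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n_subj, n_edges = 101, 300
connectivity = rng.normal(size=(n_subj, n_edges))                  # vectorized connectomes
age = 50 + connectivity[:, :5].sum(1) + rng.normal(size=n_subj)    # synthetic target

model = RidgeCV(alphas=np.logspace(-2, 3, 20))
pred = cross_val_predict(model, connectivity, age, cv=5)
print("r =", np.corrcoef(pred, age)[0, 1])
```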
RANDGAN: Randomized Generative Adversarial Network for Detection of COVID-19 in Chest X-ray
Sam Motamed
Automation of COVID-19 testing using medical images can speed up the testing process of patients where health care systems lack sufficient numbers of the reverse-transcription polymerase chain reaction (RT-PCR) tests. Supervised deep learning models such as convolutional neural networks (CNN) need enough labeled data for all classes to correctly learn the task of detection. Gathering labeled data is a cumbersome task and requires time and resources which could further strain health care systems and radiologists at the early stages of a pandemic such as COVID-19. In this study, we propose a randomized generative adversarial network (RANDGAN) that detects images of an unknown class (COVID-19) from known and labelled classes (Normal and Viral Pneumonia) without the need for labels and training data from the unknown class of images (COVID-19). We used the largest publicly available COVID-19 chest X-ray dataset, COVIDx, which is comprised of Normal, Pneumonia, and COVID-19 images from multiple public databases. In this work, we use transfer learning to segment the lungs in the COVIDx dataset. Next, we show why segmentation of the region of interest (lungs) is vital to correctly learn the task of classification, specifically in datasets that contain images from different resources as it is the case for the COVIDx dataset. Finally, we show improved results in detection of COVID-19 cases using our generative model (RANDGAN) compared to conventional generative adversarial networks (GANs) for anomaly detection in medical images, improving the area under the ROC curve from 0.71 to 0.77.
Harmonization and the Worst Scanner Syndrome
Daniel Moyer
We show that for a wide class of harmonization/domain-invariance schemes several undesirable properties are unavoidable. If a predictive machine is made invariant to a set of domains, the accuracy of the output predictions (as measured by mutual information) is limited by the domain with the least amount of information to begin with. If a real label value is highly informative about the source domain, it cannot be accurately predicted by an invariant predictor. These results are simple and intuitive, but we believe that it is beneficial to state them for medical imaging harmonization.
MVD-Fuse: Detection of White Matter Degeneration via Multi-View Learning of Diffusion Microstructure
Shreyas Fadnavis
Detecting neuro-degenerative disorders in early-stage and asymptomatic patients is challenging. Diffusion MRI (dMRI) has shown great success in generating biomarkers for cellular organization at the microscale level using complex biophysical models, but there has never been a consensus on a clinically usable standard model. Here, we propose a new framework (MVD-Fuse) to integrate measures of diverse diffusion models to detect alterations of white matter microstructure. The spatial maps generated by each measure are considered as a different diffusion representation (view), the fusion of these views being used to detect differences between clinically distinct groups. We investigate three different strategies for performing intermediate fusion: neural networks (NN), multiple kernel learning (MKL) and multi-view boosting (MVB). As a proof of concept, we applied MVD-Fuse to a classification of premanifest Huntington's disease (pre-HD) individuals and healthy controls in the TRACK-ON cohort. Our results indicate that the MVD-Fuse boosts predictive power, especially with MKL (0.90 AUC vs 0.85 with the best single diffusion measure). Overall, our results suggest that an improved characterization of pathological brain microstructure can be obtained by combining various measures from multiple diffusion models.
Zero-dose PET Reconstruction with Missing Input by U-Net with Attention Modules
Jiahong Ouyang
Positron emission tomography (PET) is a widely used molecular imaging technique with many clinical applications. To obtain high-quality images, the amount of injected radiotracer in current protocols leads to the risk of radiation exposure in scanned subjects. Recently, deep learning has been successfully used to enhance the quality of low-dose PET images. Extending this to "zero-dose," i.e., predicting PET images based solely on data from other imaging modalities such as multimodal MRI, is significantly more challenging but also much more impactful. In this work, we propose an attention-based framework that uses multi-contrast MRI to reconstruct PET images for the most commonly used radiotracer, 18F-fluorodeoxyglucose (FDG), a marker of metabolism. We also introduce an input-dropout training strategy to handle possibly missing MRI contrasts. We evaluate our methods on a dataset of patients with brain tumors, showing the ability to create realistic and clinically meaningful FDG brain PET images with low errors compared with full-dose ground-truth PET images.
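The input-dropout strategy can be illustrated with a short sketch (assumed details, not the authors' code): whole MRI contrast channels are randomly zeroed during training so the model learns to tolerate missing inputs at test time.

```python
# Randomly drop entire MRI contrast channels as a training-time augmentation.
import torch

def contrast_dropout(x, p_drop=0.2, generator=None):
    """x: (batch, contrasts, ...). Drops whole contrast channels, keeping at least one."""
    b, c = x.shape[:2]
    keep = (torch.rand(b, c, generator=generator) > p_drop).float()
    keep[keep.sum(1) == 0, 0] = 1.0                       # guarantee one contrast survives
    return x * keep.view(b, c, *([1] * (x.dim() - 2)))

multicontrast_mri = torch.rand(2, 4, 32, 64, 64)          # e.g. T1, T1c, T2, FLAIR (stand-in)
augmented = contrast_dropout(multicontrast_mri)
```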
Predicting the Need for Intensive Care for COVID-19 Patients using Deep Learning on Chest Radiography
Isabelle Hu
In this study, we propose an artificial intelligence (AI) COVID-19 prognosis method to predict patients’ needs for intensive care by analyzing chest radiography (CXR) images using deep learning. The dataset consisted of the CXR exams of 1178 COVID-19-positive patients as confirmed by reverse transcription polymerase chain reaction tests for the SARS-CoV-2 virus, 20% of which were held out for testing. Our model was based on DenseNet121, and a curriculum learning technique was employed to train on a sequence of gradually more specific and complex tasks: 1) fine-tuning a model pretrained on ImageNet using a previously established CXR dataset with a broad spectrum of pathologies, 2) refining on another established dataset to detect pneumonia, and 3) fine-tuning on our training/validation dataset to predict patients’ needs for intensive care within 24, 48, 72, and 96 hours following the CXR exams. The classification performance was evaluated on the independent test set using the area under the receiver operating characteristic curve (AUC) as the performance metric in the task of distinguishing between those COVID-19-positive patients who required intensive care and those who did not. We achieved an AUC [95% confidence interval] of 0.77 [0.70, 0.84] when predicting the need for intensive care 24 hours in advance, and at least 0.73 [0.66, 0.80] for earlier predictions based on the AI prognostic marker derived from CXR images.
Community Detection in Medical Image Datasets: Using Wavelets and Spectral Clustering
Roozbeh Yousefzadeh
Medical image datasets can have large number of images representing patients with different health conditions and various disease severity. When dealing with raw unlabeled image datasets, the large number of samples often makes it hard for non-experts to understand the variety of images present in a dataset. Supervised learning methods rely on labeled images which requires a considerable effort by medical experts to first understand the communities of images present in the data and then labeling the images. Here, we propose an algorithm to facilitate the automatic identification of communities in medical image datasets. We further explain that such analysis can also be insightful in a supervised setting, when the images are already labeled. Such insights are useful because, in reality, health and disease severity can be considered a continuous spectrum, and within each class, there usually are finer communities worthy of investigation, especially when they have similarities to communities in other classes. In our approach, we use wavelet decomposition of images in tandem with spectral methods. We show that the eigenvalues of a graph Laplacian can reveal the number of notable communities in an image dataset. In our experiments, we use a dataset of images labeled with different conditions for COVID patients. We detect 25 communities in the dataset and then observe that only 5 of those communities contain patients with pneumonia.
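The following sketch illustrates the wavelet-plus-spectral pipeline: wavelet features define an affinity graph, the Laplacian eigen-spectrum suggests the number of communities via the eigengap heuristic, and spectral clustering assigns community labels. The wavelet family, kernel width, and feature pooling are assumptions, not the paper's exact pipeline.

    # Counting communities from the Laplacian spectrum of a wavelet-feature graph.
    import numpy as np
    import pywt
    from sklearn.metrics.pairwise import rbf_kernel
    from sklearn.cluster import SpectralClustering
    from scipy.sparse.csgraph import laplacian

    def wavelet_features(image: np.ndarray) -> np.ndarray:
        """Flatten the level-2 'db2' wavelet approximation coefficients."""
        coeffs = pywt.wavedec2(image, "db2", level=2)
        return coeffs[0].ravel()

    images = [np.random.rand(128, 128) for _ in range(60)]   # placeholder slices
    X = np.stack([wavelet_features(im) for im in images])

    A = rbf_kernel(X, gamma=1.0 / X.shape[1])                 # affinity graph
    L = laplacian(A, normed=True)
    eigvals = np.sort(np.linalg.eigvalsh(L))

    # Eigengap heuristic: a large jump after the k-th smallest eigenvalue
    # suggests roughly k well-separated communities.
    gaps = np.diff(eigvals[:15])
    k = max(2, int(np.argmax(gaps)) + 1)
    labels = SpectralClustering(n_clusters=k, affinity="precomputed").fit_predict(A)
    print("estimated communities:", k)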
Semi-Supervised Learning of MR Image Synthesis without Fully-Sampled Ground-Truth Acquisitions
Mahmut Yurt
In this study, we present a novel semi-supervised generative model for multi-contrast MRI that synthesizes high-quality images without requiring large training sets of costly fully-sampled images of source or target contrasts. To do this, the proposed method introduces a selective loss expressed only in the available k-space coefficients, and further leverages randomized sampling trajectories across training subjects to effectively learn relationships between acquired and non-acquired k-space samples at all locations. Comprehensive experiments on multi-contrast brain images clearly demonstrate that the proposed method maintains performance equivalent to a gold-standard model based on fully-supervised training, while alleviating the undesirable dependency on large-scale fully-sampled MRI acquisitions.
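A minimal sketch of a loss restricted to acquired k-space coefficients is shown below; the tensor shapes, the L1 penalty, and the binary sampling mask are assumptions for illustration, not the paper's exact formulation.

    # Loss expressed only where k-space samples were actually acquired.
    import torch

    def selective_kspace_loss(pred_image: torch.Tensor,
                              acquired_kspace: torch.Tensor,
                              mask: torch.Tensor) -> torch.Tensor:
        """pred_image:      (batch, H, W) synthesized target-contrast image.
        acquired_kspace: (batch, H, W) complex undersampled k-space of the target.
        mask:            (batch, H, W) binary sampling mask (1 = acquired)."""
        pred_kspace = torch.fft.fft2(pred_image.to(torch.complex64))
        diff = (pred_kspace - acquired_kspace) * mask
        return diff.abs().sum() / mask.sum().clamp(min=1)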
Probabilistic Recovery of Missing Phase Images in Contrast-Enhanced CT
Dhruv Patel
Contrast-Enhanced CT (CECT) imaging is used in the diagnosis of renal cancer and the planning of surgery. Often, some CECT phase images are either completely missing or are corrupted with external noise, making them useless. We propose a probabilistic deep generative model for imputing missing phase images in a sequence of CECT images. Our proposed model recovers the missing phase images with quantified uncertainty estimates, enabling medical decision-makers to make better-informed decisions. Furthermore, we propose a novel style-based adversarial loss to learn very fine-scale features unique to CECT imaging, resulting in better recovery. We demonstrate the efficacy of this algorithm using a patient dataset collected in an IRB-approved retrospective study.
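As a point of reference for how fine-scale texture statistics are often matched, the sketch below shows a generic Gram-matrix style term between recovered and reference feature maps. This is a common formulation only, not necessarily the paper's style-based adversarial loss; the feature extractor and weighting are left unspecified.

    # Generic Gram-matrix style term for matching fine-scale texture statistics.
    import torch

    def gram_matrix(features: torch.Tensor) -> torch.Tensor:
        """features: (batch, channels, H, W) activations from any feature extractor."""
        b, c, h, w = features.shape
        f = features.view(b, c, h * w)
        return f @ f.transpose(1, 2) / (c * h * w)

    def style_loss(fake_feats: torch.Tensor, real_feats: torch.Tensor) -> torch.Tensor:
        return torch.nn.functional.mse_loss(gram_matrix(fake_feats),
                                            gram_matrix(real_feats))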
Scalable solutions for MR image classification of Alzheimer's disease
Sarah Brueningk
Magnetic resonance imaging is one of the flagship techniques for non-invasive medical diagnosis. Yet, high-resolution three-dimensional (3D) imaging poses a challenge for machine learning applications: how do we determine the optimal trade-off between computational cost and the imaging detail retained? Here, we present two scalable approaches for image classification, relying on topological data analysis and on ensemble classification with parallelized 3D convolutional neural networks. We demonstrate the applicability of our models on the task of classifying MR images of Alzheimer's disease patients and cognitively normal subjects. Our approaches achieve competitive results in terms of area under the precision-recall curve (0.95+/-0.03).
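One way to make the ensemble idea concrete is to train a small 3D CNN per subvolume and average the predictions, as sketched below; the patch layout and network size are illustrative assumptions, not the authors' architecture.

    # Ensembling small 3D CNNs trained on different subvolumes of an MR scan.
    import torch
    import torch.nn as nn

    class Small3DCNN(nn.Module):
        def __init__(self, n_classes: int = 2):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
                nn.Conv3d(8, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool3d(1),
            )
            self.classifier = nn.Linear(16, n_classes)

        def forward(self, x):
            return self.classifier(self.features(x).flatten(1))

    def ensemble_predict(models, subvolumes):
        """Average softmax outputs of per-subvolume models (one model per patch)."""
        probs = [torch.softmax(m(v), dim=1) for m, v in zip(models, subvolumes)]
        return torch.stack(probs).mean(dim=0)

    # models = [Small3DCNN() for _ in range(8)]           # one model per 3D patch
    # subvolumes = list of (batch, 1, D, H, W) tensors cropped from the scan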
Attention Transfer Outperforms Transfer Learning in Medical Image Disease Classifiers
Sina Akbarian
Convolutional neural networks (CNNs) are widely used in medical image diagnosis. However, training CNNs is prohibitive in a low-data environment. In this study, for the low-data medical image domain, we propose a novel knowledge transfer approach to facilitate the training of CNNs. Our approach adopts the attention transfer framework to transfer knowledge from a carefully pre-trained CNN teacher to a student CNN. The performance of the CNN models is then evaluated on three medical image datasets: Diabetic Retinopathy, CheXpert, and ChestX-ray8. We compare our results with the well-known and widely used transfer learning approach. We show that the teacher-student (attention transfer) framework not only outperforms transfer learning in both in-domain and cross-domain knowledge transfer but also behaves as a regularizer.
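For readers unfamiliar with attention transfer, the sketch below shows the standard activation-based formulation (matching L2-normalized spatial attention maps between teacher and student layers); the layer pairing and weighting factor are assumptions, not necessarily the study's exact setup.

    # Activation-based attention transfer loss between teacher and student.
    import torch
    import torch.nn.functional as F

    def attention_map(features: torch.Tensor) -> torch.Tensor:
        """features: (batch, C, H, W) -> L2-normalized spatial map (batch, H*W)."""
        amap = features.pow(2).sum(dim=1).flatten(1)
        return F.normalize(amap, p=2, dim=1)

    def attention_transfer_loss(student_feats, teacher_feats, beta: float = 1e3):
        """Squared differences between paired attention maps, summed over layers."""
        return beta * sum(
            (attention_map(s) - attention_map(t)).pow(2).mean()
            for s, t in zip(student_feats, teacher_feats)
        )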
Retrospective Motion Correction of MR Images using Prior-Assisted Deep Learning
Soumick Chatterjee
In MRI, motion artifacts are among the most common types of artifact. They can greatly degrade images and make them unusable for accurate diagnosis. Traditional techniques, such as prospective or retrospective motion correction, are commonly used to avoid or limit the presence of motion artifacts. Recently, several methods based on deep learning have been proposed to solve this problem. This work aims to enhance the performance of existing deep learning models by making use of additional information available as image priors. The proposed approach has shown promising results and will be further investigated for clinical validity.
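The abstract does not spell out how the priors enter the network; one simple and commonly used option, shown below purely as an illustrative assumption, is to concatenate the prior with the motion-corrupted image as an extra input channel.

    # Injecting an image prior as an extra input channel (illustrative only).
    import torch

    def with_prior(corrupted: torch.Tensor, prior: torch.Tensor) -> torch.Tensor:
        """corrupted, prior: (batch, 1, H, W) -> (batch, 2, H, W) network input."""
        return torch.cat([corrupted, prior], dim=1)

    # corrected = correction_net(with_prior(corrupted_batch, prior_batch))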
Modified VGG16 Network for Medical Image Analysis
Amulya Vatsavai
Thoracic diseases, like pneumonia and emphysema, affect millions of people around the globe every year. Chest radiography is essential to detecting and treating these diseases. Manually interpreting radiographic images is a time-consuming and fatiguing task. In regions without sufficient access to radiologists or radiographic equipment, the inability to analyze these images adversely affects patient care. Recent deep learning-based thoracic disease classification using X-ray images has been shown to perform on par with expert radiologists in interpreting medical images. The purpose of this study is to compare the transfer learning performance of different deep learning algorithms in detecting thoracic pathologies in chest radiographs. In addition, we present a simple modification to the well-known VGG16 network to overcome overfitting. Comparative analysis shows that, given the lack of sufficient labeled images in the medical domain, careful utilization of pretrained networks may provide a good alternative to specialized handcrafted networks.
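The abstract does not detail the modification; one illustrative way to adapt VGG16 against overfitting on small chest X-ray datasets is sketched below (frozen backbone plus a smaller, dropout-regularized head). This is a hypothetical modification for illustration, not the paper's exact one.

    # A dropout-regularized VGG16 variant for small medical datasets (sketch).
    import torch.nn as nn
    from torchvision.models import vgg16

    model = vgg16(pretrained=True)
    for p in model.features.parameters():
        p.requires_grad = False                        # keep ImageNet features fixed

    model.classifier = nn.Sequential(
        nn.Linear(512 * 7 * 7, 256), nn.ReLU(), nn.Dropout(0.5),
        nn.Linear(256, 14),                            # e.g. 14 thoracic pathologies
    )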
Adversarial cycle-consistent synthesis of cerebral microbleeds for data augmentation
Khrystyna Faryna
We propose a novel framework for controllable pathological image synthesis for data augmentation. Inspired by CycleGAN, we perform cycle-consistent image-to-image translation between two domains: healthy and pathological. Guided by a semantic mask, an adversarially trained generator synthesizes pathology on a healthy image at the specified location. We demonstrate our approach on an institutional dataset of cerebral microbleeds in traumatic brain injury patients. We utilize synthetic images generated with our method for data augmentation in the detection of cerebral microbleeds. Enriching the training dataset with synthetic images shows the potential to increase detection performance for cerebral microbleeds in traumatic brain injury patients.
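The mask guidance can be sketched as conditioning the healthy-to-pathological generator on the lesion mask, for example by channel concatenation as below; the generator itself and the exact conditioning mechanism are not shown here and are assumptions for illustration.

    # Mask-guided conditioning of a healthy-to-pathological generator (sketch).
    import torch

    def generator_input(healthy: torch.Tensor, lesion_mask: torch.Tensor) -> torch.Tensor:
        """healthy, lesion_mask: (batch, 1, H, W) -> (batch, 2, H, W)."""
        return torch.cat([healthy, lesion_mask], dim=1)

    # pathological_fake = G_healthy_to_pathological(generator_input(img, mask))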
Self-supervised out-of-distribution detection in brain CT scans
Seong Tae Kim
Medical imaging data suffer from the limited availability of annotation because annotating 3D medical data is a time-consuming and expensive task. Moreover, even when annotation is available, supervised learning-based approaches suffer from highly imbalanced data: most scans acquired during screening are from normal subjects, while abnormal cases show large variations. To address these issues, unsupervised deep anomaly detection methods have recently been reported that train a model on large sets of normal scans and detect abnormal scans by calculating the reconstruction error. In this paper, we propose a novel self-supervised learning technique for anomaly detection. Our architecture largely consists of two parts: 1) reconstruction and 2) predicting geometric transformations. By training the network to predict geometric transformations, the model can learn better image features and the distribution of normal scans. At test time, the geometric transformation predictor assigns an anomaly score by calculating the error between the applied geometric transformation and the prediction. Moreover, we further use self-supervised learning with context restoration for pretraining our model. Comparative experiments on clinical brain CT scans verify the effectiveness of the proposed method.
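A minimal sketch of the transformation-prediction pretext task and the resulting anomaly score follows; the transformation set (90-degree rotations) and the scoring details are assumptions, not the paper's exact configuration.

    # Geometric-transformation prediction as a self-supervised anomaly score.
    import torch
    import torch.nn.functional as F

    def rotate_k90(x: torch.Tensor, k: int) -> torch.Tensor:
        """Rotate a (batch, C, H, W) slice batch by k * 90 degrees."""
        return torch.rot90(x, k, dims=(2, 3))

    def anomaly_score(model: torch.nn.Module, x: torch.Tensor) -> torch.Tensor:
        """Average error in predicting the applied rotation; higher = more anomalous."""
        scores = []
        for k in range(4):
            logits = model(rotate_k90(x, k))           # (batch, 4) rotation logits
            target = torch.full((x.shape[0],), k, dtype=torch.long, device=x.device)
            scores.append(F.cross_entropy(logits, target, reduction="none"))
        return torch.stack(scores).mean(dim=0)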
Improving Interpretability in Medical Imaging Diagnosis using Adversarial Training
Andrei Margeloiu
We investigate the influence of adversarial training on the interpretability of convolutional neural networks (CNNs), specifically applied to diagnosing skin cancer. We show that gradient-based saliency maps of adversarially trained CNNs are significantly sharper and more visually coherent than those of standardly trained CNNs. Furthermore, we show that adversarially trained networks highlight regions with significant color variation within the lesion, a common characteristic of melanoma. We find that fine-tuning a robust network with a small learning rate further improves saliency maps' sharpness. Lastly, we provide preliminary work suggesting that robustifying the first layers to extract robust low-level features leads to visually coherent explanations.
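For reference, the gradient-based saliency maps being compared can be computed with the standard "vanilla gradient" recipe sketched below; this is illustrative and omits the adversarial training and fine-tuning steps described above.

    # Vanilla-gradient saliency map for a trained classifier (sketch).
    import torch

    def saliency_map(model: torch.nn.Module, image: torch.Tensor,
                     target_class: int) -> torch.Tensor:
        """image: (1, 3, H, W). Returns a (H, W) map of |d score / d pixel|."""
        image = image.clone().requires_grad_(True)
        score = model(image)[0, target_class]
        score.backward()
        return image.grad.abs().max(dim=1)[0].squeeze(0)

    # Comparing this map for a standardly trained vs. adversarially trained CNN
    # illustrates the sharpness difference reported above.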