Workshop
Learning from Time Series for Health
Sana Tonekaboni · Thomas Hartvigsen · Satya Narayan Shukla · Gunnar Rätsch · Marzyeh Ghassemi · Anna Goldenberg

Fri Dec 02 07:00 AM -- 03:00 PM (PST) @ Room 392

Time series data are ubiquitous in healthcare, from medical time series to wearable data, and present an exciting opportunity for machine learning methods to extract actionable insights about human health. However, huge gaps remain between the existing time series literature and what is needed to make machine learning systems practical and deployable for healthcare. This is because learning from time series for health is notoriously challenging: labels are often noisy or missing, data can be multimodal and extremely high dimensional, missing values are pervasive, measurements are irregular, data distributions shift rapidly over time, explaining model outcomes is challenging, and deployed models require careful maintenance over time. These challenges introduce interesting research problems that the community has been actively working on for the last few years, with significant room for contribution still remaining. Learning from time series for health is a uniquely challenging and important area with increasing application. Significant advancements are required to realize the societal benefits of these systems for healthcare. This workshop will bring together machine learning researchers dedicated to advancing the field of time series modeling in healthcare to bring these models closer to deployment.

Fri 7:00 a.m. - 7:15 a.m.
Opening remarks (In-person intro)

Fri 7:15 a.m. - 7:45 a.m.
Invited speaker - Yan Liu, USC (Talk)

Fri 7:45 a.m. - 8:15 a.m.
Invited speaker - Stephanie Hyland, Microsoft Research UK (Talk)

Fri 8:15 a.m. - 8:30 a.m.
Coffee break (Break)

Fri 8:30 a.m. - 9:00 a.m.
Invited speaker - Danielle Belgrave, DeepMind UK (Talk)

Fri 9:00 a.m. - 9:30 a.m.
Spotlight talks for poster session I (Spotlight)

Fri 9:00 a.m. - 9:30 a.m.
Performance and utility trade-off in interpretable sleep staging (Spotlight)
Recent advances in deep learning have led to the development of models approaching human level of accuracy. However, healthcare remains an area lacking in widespread adoption. The safety-critical nature of healthcare results in a natural reticence to put these black-box deep learning models into practice. In this paper, we explore interpretable methods for a clinical decision support system, sleep staging, based on physiological signals such as EEG, EOG, and EMG. A recent work has shown sleep staging using simple models and an exhaustive set of features can perform nearly as well as deep learning approaches but only for certain datasets. Moreover, the utility of these features from a clinical standpoint is unclear. On the other hand, the proposed framework, NormIntSleep, shows that by representing deep learning embeddings using normalized features, great performance can be obtained across different datasets. NormIntSleep performs 4.5% better than the exhaustive feature-based approach and 1.5% better than other representation learning approaches. An empirical comparison between the utility of the interpretations of these models highlights the improved alignment with clinical expectations when performance is traded off slightly.
Irfan Al-Hussaini · Cassie Mitchell

Fri 9:00 a.m. - 9:30 a.m.
Learning Absorption Rates in Glucose-Insulin Dynamics from Meal Covariates (Spotlight)
Traditional models of glucose-insulin dynamics rely on heuristic parameterizations chosen to fit observations within a laboratory setting. However, these models cannot describe glucose dynamics in daily life. One source of failure is in their descriptions of glucose absorption rates after meal events. A meal's macronutritional content has nuanced effects on the absorption profile, which is difficult to model mechanistically. In this paper, we propose to learn the effects of macronutrition content from glucose-insulin data and meal covariates. Given macronutrition information and meal times, we use a neural network to predict an individual's glucose absorption rate. We use this neural rate function as the control function in a differential equation of glucose dynamics, enabling end-to-end training. On simulated data, our approach is able to closely approximate true absorption rates, resulting in better forecasts than heuristic parameterizations, despite only observing glucose, insulin, and macronutritional information. Our work readily generalizes to meal events with higher-dimensional covariates, such as images, setting the stage for glucose dynamics models that are personalized to each individual's daily life.
Ke Alexander Wang · Matthew Levine · Jiaxin Shi · Emily Fox

Fri 9:00 a.m. - 9:30 a.m.
Real-world Challenges in Leveraging Electrocardiograms for Coronary Artery Disease Classification (Spotlight)
This work investigates coronary artery disease (CAD) prediction from electrocardiogram (ECG) data taking into account different windows with respect to the time of diagnosis. We report that ECG waveform measurements automatically collected during ECG recordings contain sufficient features for good classification of CAD using machine learning models up to five years before diagnosis.
On the other hand, convolutional neural networks trained on the ECG signals themselves appear to best extract CAD-related features when processing data collected one year after a diagnosis is made. Through this work we demonstrate that the type of ECG data and the time window with respect to diagnosis should guide model selection.
Jessica De Freitas · Alexander Charney · Isotta Landi

Fri 9:00 a.m. - 9:30 a.m.
Empirical Evaluation of Data Augmentations for Biobehavioral Time Series Data with Deep Learning (Spotlight)
Deep learning has performed remarkably well on many tasks recently. However, the superior performance of deep models relies heavily on the availability of a large amount of training data, which limits the wide adaptation of deep models on various clinical and affective computing tasks, as the labeled data are usually very limited. As an effective technique to increase the data variability and thus train deep models with better generalization, data augmentation (DA) is a critical step for the success of deep learning models on biobehavioral time series data. However, the effectiveness of various DAs for different datasets with different tasks and deep models is understudied for biobehavioral time series data. In this paper, we first systematically review eight basic DA methods for biobehavioral time series data, and evaluate the effects on seven datasets with three backbones. Next, we explore adapting more recent DA techniques (i.e., automatic augmentation, random augmentation) to biobehavioral time series data. Last, we try to answer the question of why a DA is effective (or not) by first summarizing two desired attributes for augmentations (challenging and faithful), and then utilizing two metrics to quantitatively measure the corresponding attributes.
We find that an effective DA needs to generate challenging but still faithful transformations, which can guide us in the search for more effective DA for biobehavioral time series data.
Huiyuan Yang · Han Yu · Akane Sano

Fri 9:00 a.m. - 9:30 a.m.
Supervised change-point detection with dimension reduction, applied to physiological signals (Spotlight)
This paper proposes an automatic method to calibrate change point detection algorithms for high-dimensional time series. Our procedure builds on the ability of an expert (e.g. a medical researcher) to produce approximate segmentation estimates, called partial annotations, for a small number of signal examples. This contribution is a supervised approach to learn a diagonal Mahalanobis metric, which, once combined with a detection algorithm, is able to reproduce the expert's segmentation strategy on out-of-sample signals. Unlike previous works for change detection, our method includes a sparsity-inducing regularization which performs supervised dimension selection, and adapts to partial annotations. Experiments on activity signals collected from healthy and neurologically impaired patients support the fact that supervision markedly improves detection accuracy.
Charles Truong · Laurent Oudre

Fri 9:00 a.m. - 9:30 a.m.
An Electrocardiogram-Based Risk Score for Cardiovascular Mortality (Spotlight)
The electrocardiogram (ECG) is the most frequently performed cardiovascular diagnostic test, but it is unclear how much information resting ECGs contain about long-term cardiovascular risk. Using a dataset of 312,422 resting 12-lead ECGs collected at [Medical Center 1, redacted for anonymity], we developed SEER, the preScient Estimator of Electrocardiogram Risk.
SEER predicts five-year cardiovascular mortality with an area under the receiver operating characteristic curve (AUC) of 0.83 in a held-out test set at [Medical Center 1], and with AUCs of 0.79 and 0.83 when independently evaluated at [Medical Center 2] and [Medical Center 3] respectively. SEER predicts 5-year atherosclerotic cardiovascular disease (ASCVD) with an AUC of 0.67 and is close in performance to the Pooled Cohort Equations for ASCVD Risk while being only modestly correlated. SEER has the potential to provide value by stratifying patients beyond current clinical practice.
John Hughes · David Ouyang · Pierre Elias · James Zou · Euan Ashley · Marco Perez

Fri 9:00 a.m. - 9:30 a.m.
A Framework for the Evaluation of Clinical Time Series Models (Spotlight)
Early detection of critical events is one of the mainstays of clinical time series prediction tasks. As data from electronic health records become larger in volume and availability increases, models that can predict critical events before they occur and inform clinical decision making have the potential to transform aspects of clinical care. There has been a recent surge in literature looking at early detection in the context of clinical time series. However, methods used to evaluate clinical time series models in which multiple predictions per time series are made often do not adequately measure the utility of the models in the clinical setting. Classical metrics such as the Area Under the Receiver Operating Characteristic (AUROC) and the Area Under the Precision Recall Curve (AUPRC) fail to fully capture the true, real-world performance of these models. In this work, we i) propose a method to evaluate early prediction models in a way that is consistent with their application in the clinical setting, and ii) provide a fast, open-source, and native cross-platform implementation.
Michael Gao · Jiayu Yao · Ricardo Henao

Fri 9:00 a.m. - 9:30 a.m.
Fair Multimodal Checklists for Interpretable Clinical Time Series Prediction (Spotlight)
Checklists are interpretable and easy-to-deploy models often used in real-world clinical decision-making. Prior work has demonstrated that checklists can be learned from binary input features in a data-driven manner by formulating the training objective as an integer programming problem. In this work, we learn diagnostic checklists for the task of phenotype classification with time series vitals data of ICU patients from the MIMIC-IV dataset. For 13 clinical phenotypes, we fully explore the empirical behavior of the checklist model in regard to multimodality, time series dynamics, and fairness. Our results show that the addition of the imaging data modality and the addition of shapelets that capture time series dynamics can significantly improve predictive performance. Checklist models optimized with explicit fairness constraints achieve the target fairness performance, at the expense of lower predictive performance.
Qixuan Jin · Haoran Zhang · Thomas Hartvigsen · Marzyeh Ghassemi

Fri 9:00 a.m. - 9:30 a.m.
Personalized Dose Guidance using Safe Bayesian Optimization (Spotlight)
This work considers the problem of personalized dose guidance using Bayesian optimization that learns the optimum drug dose tailored to each individual, thus improving therapeutic outcomes. Safe learning using an interior point method guarantees safety with high probability. This is demonstrated using the problem of learning the optimum bolus insulin dose in patients with type 1 diabetes to counteract the effect of meal consumption. Starting from no a priori information about the patients, our dose guidance algorithm is able to improve the therapeutic outcome (measured in terms of % time-in-range) without jeopardizing patient safety. Other potential healthcare applications are also discussed.

Fri 9:30 a.m. - 10:30 a.m.
Poster Session I (Poster Session)

Fri 9:30 a.m. - 10:30 a.m.
Performance and utility trade-off in interpretable sleep staging (Poster)
Recent advances in deep learning have led to the development of models approaching human level of accuracy. However, healthcare remains an area lacking in widespread adoption. The safety-critical nature of healthcare results in a natural reticence to put these black-box deep learning models into practice. In this paper, we explore interpretable methods for a clinical decision support system, sleep staging, based on physiological signals such as EEG, EOG, and EMG. A recent work has shown sleep staging using simple models and an exhaustive set of features can perform nearly as well as deep learning approaches but only for certain datasets. Moreover, the utility of these features from a clinical standpoint is unclear. On the other hand, the proposed framework, NormIntSleep, shows that by representing deep learning embeddings using normalized features, great performance can be obtained across different datasets. NormIntSleep performs 4.5% better than the exhaustive feature-based approach and 1.5% better than other representation learning approaches. An empirical comparison between the utility of the interpretations of these models highlights the improved alignment with clinical expectations when performance is traded off slightly.
Irfan Al-Hussaini · Cassie Mitchell

Fri 9:30 a.m. - 10:30 a.m.
PiRL: Participant-Invariant Representation Learning for Health Care using Wearable Data (Poster)
Due to the individual heterogeneities among human subjects, researchers observed performance gaps between generic (one-size-fits-all) models and person-specific models in data-driven health applications. However, in real-world applications, generic models are usually more favored due to factors such as the new-user-adaptation issue, system complexities, etc.
To improve the performance of the generic model, we propose a representation learning framework that learns participant-invariant representations, named PiRL. The proposed framework constrains the latent space using maximum mean discrepancy (MMD) to close the distribution gap among subjects. Further, a triplet loss is utilized to optimize the learned representations for downstream health applications. We evaluate our frameworks on two public datasets for human physical and mental health problems, detecting sleep apnea and stress, respectively. As preliminary results, we found the proposed approach shows around a 5% increase in accuracy with statistical differences compared to the baseline.
Zhaoyang Cao · Han Yu · Huiyuan Yang · Akane Sano

Fri 9:30 a.m. - 10:30 a.m.
Dynamic outcomes-based clustering of disease progression in mechanically ventilated patients (Poster)
The advancement of Electronic Health Records (EHRs) and machine learning has enabled a data-driven and personalised approach to healthcare. One step in this direction is to uncover patient sub-types with similar disease trajectories in a heterogeneous population. This is especially important in the context of mechanical ventilation in intensive care, where mortality is high and there is no consensus on treatment. In this work, we present a new approach to clustering mechanical ventilation episodes, using a multi-task combination of supervised, self-supervised and unsupervised learning techniques. Our dynamic clustering assignment is explicitly guided to reflect the phenotype, trajectory and outcomes of the patient. Experimentation on a real-world dataset is encouraging, and we hope that we could someday translate this into actionable insights in guiding future clinical research.
Emma Rocheteau · Ioana Bica · Pietro Lió · Ari Ercole

Fri 9:30 a.m. - 10:30 a.m.
Treatment-RSPN: Recurrent Sum-Product Networks for Sequential Treatment Regimes (Poster)
Sum-product networks (SPNs) have recently emerged as a novel deep learning architecture enabling highly efficient probabilistic inference. Since their introduction, SPNs have been applied to a wide range of data modalities and extended to time-sequence data. In this paper, we propose a general framework for modelling sequential treatment decision-making behaviour and treatment response using recurrent sum-product networks (RSPNs). Models developed using our framework benefit from the full range of RSPN capabilities, including the abilities to model the full distribution of the data, to seamlessly handle latent variables, missing values and categorical data, and to efficiently perform marginal and conditional inference. Our methodology is complemented by a novel variant of the expectation-maximization algorithm for RSPNs, enabling efficient training of our models. We evaluate our approach on a synthetic dataset as well as real-world data from the MIMIC-IV intensive care unit medical database. Our evaluation demonstrates that our approach can closely match the ground-truth data generation process on synthetic data and achieve results close to neural and probabilistic models while using a tractable and highly versatile model.
Adam Dejl · Jonathan Fei · Ardavan Saeedi · Li-wei Lehman

Fri 9:30 a.m. - 10:30 a.m.
Learning Absorption Rates in Glucose-Insulin Dynamics from Meal Covariates (Poster)
Traditional models of glucose-insulin dynamics rely on heuristic parameterizations chosen to fit observations within a laboratory setting. However, these models cannot describe glucose dynamics in daily life. One source of failure is in their descriptions of glucose absorption rates after meal events. A meal's macronutritional content has nuanced effects on the absorption profile, which is difficult to model mechanistically.
In this paper, we propose to learn the effects of macronutrition content from glucose-insulin data and meal covariates. Given macronutrition information and meal times, we use a neural network to predict an individual's glucose absorption rate. We use this neural rate function as the control function in a differential equation of glucose dynamics, enabling end-to-end training. On simulated data, our approach is able to closely approximate true absorption rates, resulting in better forecasts than heuristic parameterizations, despite only observing glucose, insulin, and macronutritional information. Our work readily generalizes to meal events with higher-dimensional covariates, such as images, setting the stage for glucose dynamics models that are personalized to each individual's daily life.
Ke Alexander Wang · Matthew Levine · Jiaxin Shi · Emily Fox

Fri 9:30 a.m. - 10:30 a.m.
A Preliminary Study on Pattern Reconstruction for Optimal Storage of Wearable Sensor Data (Poster)
Efficient querying and retrieval on healthcare data is a critical challenge that we are facing today as connected devices are continuously generating petabytes of images, text, and internet of things (IoT) sensor data. One approach to efficiently store the healthcare data is to extract the relevant and representative features and store only those features instead of the continuous streaming data. However, this raises the question of how much information we can retain from the data and whether we can reconstruct the pseudo-original data when needed. By facilitating relevant and representative feature extraction, storage and reconstruction of near-original patterns, we aim to address some of the challenges faced by the explosion of the streaming data. We present a preliminary study, where we explored multiple autoencoders for the concise feature extraction and reconstruction for human activity recognition (HAR) sensor data.
Our Multi-Layer Perceptron (MLP) deep autoencoder achieved a storage reduction of 90.18%, whereas the convolutional autoencoder achieved 11.18%. For the Long Short-Term Memory (LSTM) autoencoder the reduction was 91.47% and for the convolutional LSTM autoencoder it was 72.35%. The storage reduction depended on the size and dimension of the concise representation. For higher-dimensional representations, the storage reduction was low, but retention of relevant information was high, as validated by the classification performed on the reconstructed data.
Sazia Mahfuz · Farhana Zulkernine

Fri 9:30 a.m. - 10:30 a.m.
Joint Point Process Model for Counterfactual Treatment-Outcome Trajectories Under Policy Interventions (Poster)
Policy makers need to predict the progression of an outcome before adopting a new treatment policy, which defines when and how a sequence of treatments affecting the outcome occurs in continuous time. Commonly, algorithms that predict interventional future outcome trajectories take a fixed sequence of future treatments as input. This excludes scenarios where the policy is unknown or a counterfactual analysis is needed. To handle these limitations, we develop a joint model for treatments and outcomes, which allows for the estimation of treatment policies and effects from sequential treatment-outcome data. It can answer interventional and counterfactual queries about interventions on treatment policies, as we show with a realistic semi-synthetic simulation study. This abstract is based on work that is currently under review (Anonymous).
Çağlar Hızlı · ST John · Anne Juuti · Tuure Saarinen · Kirsi Pietiläinen · Pekka Marttinen

Fri 9:30 a.m. - 10:30 a.m.
Continual Learning on Auxiliary tasks via Replayed Experiences: CLARE (Poster)
In healthcare, it is common for the initial goal of modeling to be the prediction of critical but rare tasks (e.g. septic shock, cardiac arrest).
The reality upon deployment of such a model is often different: the goal now is to assess the risk of a patient towards these (and other) critical events. The labels of this auxiliary goal, and their distribution, are different from the initial labels used for training the model. Continual Learning frameworks serve as an excellent way to update a model given new data, after it has been deployed in a production environment. We introduce CLARE, a Continual Learning framework which first pre-trains on a rare task (e.g. cardiac arrest), then updates according to the labels of assessed risk (a related task), collected from the clinicians in real time. We develop a novel replay-based method to sequentially learn from new data with a different label distribution. We compare our method to a model trained in a cumulative fashion as well as one that randomly replays earlier samples it has seen. We benchmark classification architectures on a simulated dataset as well as on a clinical dataset of physiological signals.
Bohdan Naida · Addison Weatherhead · Sana Tonekaboni · Anna Goldenberg

Fri 9:30 a.m. - 10:30 a.m.
Real-world Challenges in Leveraging Electrocardiograms for Coronary Artery Disease Classification (Poster)
This work investigates coronary artery disease (CAD) prediction from electrocardiogram (ECG) data taking into account different windows with respect to the time of diagnosis. We report that ECG waveform measurements automatically collected during ECG recordings contain sufficient features for good classification of CAD using machine learning models up to five years before diagnosis. On the other hand, convolutional neural networks trained on the ECG signals themselves appear to best extract CAD-related features when processing data collected one year after a diagnosis is made. Through this work we demonstrate that the type of ECG data and the time window with respect to diagnosis should guide model selection.
Jessica De Freitas · Alexander Charney · Isotta Landi

Fri 9:30 a.m. - 10:30 a.m.
Empirical Evaluation of Data Augmentations for Biobehavioral Time Series Data with Deep Learning (Poster)
Deep learning has performed remarkably well on many tasks recently. However, the superior performance of deep models relies heavily on the availability of a large amount of training data, which limits the wide adaptation of deep models on various clinical and affective computing tasks, as the labeled data are usually very limited. As an effective technique to increase the data variability and thus train deep models with better generalization, data augmentation (DA) is a critical step for the success of deep learning models on biobehavioral time series data. However, the effectiveness of various DAs for different datasets with different tasks and deep models is understudied for biobehavioral time series data. In this paper, we first systematically review eight basic DA methods for biobehavioral time series data, and evaluate the effects on seven datasets with three backbones. Next, we explore adapting more recent DA techniques (i.e., automatic augmentation, random augmentation) to biobehavioral time series data. Last, we try to answer the question of why a DA is effective (or not) by first summarizing two desired attributes for augmentations (challenging and faithful), and then utilizing two metrics to quantitatively measure the corresponding attributes. We find that an effective DA needs to generate challenging but still faithful transformations, which can guide us in the search for more effective DA for biobehavioral time series data.
Huiyuan Yang · Han Yu · Akane Sano

Fri 9:30 a.m. - 10:30 a.m.
Predicting Individual Depression Symptoms from Acoustic Features During Speech (Poster)
Current automatic depression detection systems provide predictions directly without relying on the individual symptoms/items of depression as denoted in the clinical depression rating scales. In contrast, clinicians assess each item in the depression rating scale in a clinical setting, thus implicitly providing a more detailed rationale for a depression diagnosis. In this work, we make a first step towards using the acoustic features of speech to predict individual items of the depression rating scale before obtaining the final depression prediction. For this, we use convolutional (CNN) and recurrent (long short-term memory (LSTM)) neural networks. We consider different approaches to learning the temporal context of speech. Further, we analyze two variants of voting schemes for individual item prediction and depression detection. We also include an animated visualization that shows an example of item prediction over time as the speech progresses.
Sebastian Rodriguez · Sri Harsha Dumpala · Katerina Dikaios · Sheri Rempel · Rudolf Uher · Sageev Oore

Fri 9:30 a.m. - 10:30 a.m.
Deep Neural Imputation: A Framework for Recovering Incomplete Brain Recordings (Poster)
We study the problem of time series imputation in multivariate neural recordings. Compared to standard time series imputation settings, new challenges for imputing neural recordings include the lack of adjacent timestamps for electrodes missing over days, and generalization across days and participants with different electrode configurations. Due to these challenges, the standard practice in neuroscience is to discard electrodes with missing data, even if only a part of the recording is corrupted, significantly reducing the already limited and difficult-to-obtain data.
In this paper, we establish Deep Neural Imputation (DNI), a framework to recover missing electrode recordings by learning across sessions, spatial locations, and participants. We first instantiate DNI with natural linear baselines, then develop encoder-decoder approaches based on masked electrode modeling. We evaluate DNI on 12 multielectrode, human neural datasets with naturalistic behavior. We demonstrate DNI's data imputation ability across a broad range of metrics as well as integrate DNI into an existing neural data analysis pipeline.
Sabera Talukder · Jennifer J Sun · Matthew Leonard · Bingni Brunton · Yisong Yue

Fri 9:30 a.m. - 10:30 a.m.
Improving ECG-based COVID-19 diagnosis and mortality predictions using pre-pandemic medical records at population-scale (Poster)
Pandemic outbreaks such as COVID-19 occur unexpectedly, and need immediate action due to their potentially devastating consequences on global health. Point-of-care routine assessments such as the electrocardiogram (ECG) can be used to develop prediction models for identifying individuals at risk. However, there is often too little clinically-annotated medical data, especially in early phases of a pandemic, to develop accurate prediction models. In such situations, historical pre-pandemic health records can be utilized to estimate a preliminary model, which can then be fine-tuned based on limited available pandemic data. This study shows that this approach, pre-training deep learning models with pre-pandemic data, can work effectively, by demonstrating substantial performance improvement over three different COVID-19 related diagnostic and prognostic prediction tasks. Similar transfer learning strategies can be useful for developing timely artificial intelligence solutions in future pandemic outbreaks.
Weijie Sun · Sunil Vasu Kalmady · Nariman Sepehrvand · Luan Chu · Zihan Wang · Amir Salimi · Abram Hindle · Russell Greiner · Padma Kaul

Fri 9:30 a.m. - 10:30 a.m.
Performative Prediction in Time Series: A Case Study (Poster)
Performative prediction is a phenomenon where a model’s predictions, or the decisions based on these predictions, may influence the outcomes of the model. This is especially conspicuous in a time series forecasting setting where interventions occur before outcomes are observed. These interventions dictate which data points in the time series can be used as inputs for future predictions. In this paper, we represent a patient’s symptom levels along their cancer rehabilitation plight as a time series. We use a decision-tree based model to predict the future symptom values of a patient. Based on these predictions, clinicians decide which symptom levels will be observed in the future. We propose methods to mitigate the problem of performative prediction in time series. Our results show how performative prediction may lead to a 29.4% to 40.7% higher error across different symptoms.
Rupali Bhati · Jennifer Jones · Audrey Durand

Fri 9:30 a.m. - 10:30 a.m.
Supervised change-point detection with dimension reduction, applied to physiological signals (Poster)
This paper proposes an automatic method to calibrate change point detection algorithms for high-dimensional time series. Our procedure builds on the ability of an expert (e.g. a medical researcher) to produce approximate segmentation estimates, called partial annotations, for a small number of signal examples. This contribution is a supervised approach to learn a diagonal Mahalanobis metric, which, once combined with a detection algorithm, is able to reproduce the expert's segmentation strategy on out-of-sample signals. Unlike previous works for change detection, our method includes a sparsity-inducing regularization which performs supervised dimension selection, and adapts to partial annotations.
Experiments on activity signals collected from healthy and neurologically impaired patients support the fact that supervision markedly improves detection accuracy.
Charles Truong · Laurent Oudre

Fri 9:30 a.m. - 10:30 a.m.
Semi-Supervised Learning and Data Augmentation for Wearable-based Health Monitoring System in the Wild (Poster)
Physiological and behavioral data collected from wearable or mobile sensors have been used to detect human health conditions. Sometimes the health-related annotation relies on self-reported surveys during the study, thus a limited amount of labeled data can be an obstacle in developing accurate and generalized prediction models. On the other hand, the sensors can continuously capture signals without labels. This work investigates leveraging unlabeled wearable sensor data for health condition detection. We first applied data augmentation techniques to increase the amount of training data by adding noise to the original physiological and behavioral sensor data and improving the robustness of supervised stress detection models. Second, to leverage the information learned from unlabeled samples, we pre-trained the supervised model structure using an auto-encoder and actively selected unlabeled sequences to filter noisy data. Then, we combined data augmentation techniques with consistency regularization, which enforces the consistency of prediction output based on augmented and original unlabeled data. We validated these methods in sensor-based in-the-wild stress detection tasks using 3 wearable/mobile sensor datasets collected in the wild. Our results showed that the proposed methods improved stress classification performance by 5.3% to 13.8%, compared to the baseline supervised learning models. In addition, our method showed competitive performances compared to state-of-the-art semi-supervised learning methods in the literature.
Han Yu · Akane Sano

Fri 9:30 a.m. - 10:30 a.m.
An Electrocardiogram-Based Risk Score for Cardiovascular Mortality (Poster)  link » The electrocardiogram (ECG) is the most frequently performed cardiovascular diagnostic test, but it is unclear how much information resting ECGs contain about long-term cardiovascular risk. Using a dataset of 312,422 resting 12-lead ECGs collected at [Medical Center 1, redacted for anonymity], we developed SEER, the preScient Estimator of Electrocardiogram Risk. SEER predicts five-year cardiovascular mortality with an area under the receiver operating characteristic curve (AUC) of 0.83 in a held-out test set at [Medical Center 1], and with AUCs of 0.79 and 0.83 when independently evaluated at [Medical Center 2] and [Medical Center 3] respectively. SEER predicts five-year atherosclerotic cardiovascular disease (ASCVD) with an AUC of 0.67 and is close in performance to the Pooled Cohort Equations for ASCVD Risk while being only modestly correlated with them. SEER has the potential to provide value by stratifying patients beyond current clinical practice. Link » John Hughes · David Ouyang · Pierre Elias · James Zou · Euan Ashley · Marco Perez 🔗 Fri 9:30 a.m. - 10:30 a.m. A Framework for the Evaluation of Clinical Time Series Models (Poster)  link » Early detection of critical events is one of the mainstays of clinical time series prediction tasks. As data from electronic health records become larger in volume and availability increases, models that can predict critical events before they occur and inform clinical decision making have the potential to transform aspects of clinical care. There has been a recent surge in literature looking at early detection in the context of clinical time series. However, methods used to evaluate clinical time series models in which multiple predictions per time series are made often do not adequately measure the utility of the models in the clinical setting. 
Classical metrics such as the Area Under the Receiver Operating Characteristic Curve (AUROC) and the Area Under the Precision Recall Curve (AUPRC) fail to fully capture the true, real-world performance of these models. In this work, we i) propose a method to evaluate early prediction models in a way that is consistent with their application in the clinical setting, and ii) provide a fast, open-source, and native cross-platform implementation. Link » Michael Gao · Jiayu Yao · Ricardo Henao 🔗 Fri 9:30 a.m. - 10:30 a.m. Deep-learning-based characterization of glucose biomarkers to identify type 2 diabetes, prediabetes, and healthy individuals (Poster)  link » Type 2 Diabetes (T2D) is a common chronic disease that can lead to serious comorbidities. Prediabetes is a state of increased health risk that is defined by abnormal glucose homeostasis and is strongly associated with the development of T2D and diabetic complications. Novel diagnostic or screening tools are required to identify T2D and prediabetic patients. In this study, we developed a predictive model that uses continuous glucose monitoring (CGM) signals to classify individuals as T2D, prediabetic, or healthy. We tested different durations of CGM signals to determine the minimum length of time required to achieve a reliable prediction of diabetic outcomes. We found that 12 hours of CGM signals were sufficient to achieve a classifier with a high degree of accuracy. The performance of the 12-hour model was equivalent to the performance of a model using the full period of CGM signals. The 12-hour model achieved AUCs of 0.83, 0.69, and 0.77 for identifying T2D, prediabetes, and healthy individuals, respectively. The overall AUC of the 12-hour ensemble model was 0.86. Our findings suggest a new application of currently available CGM systems: identifying T2D and prediabetes based on only a short time series of glucose profiles. Link » Sina Akbarian · Qayam Jetha · Jouhyun Jeon 🔗 Fri 9:30 a.m. - 10:30 a.m. 
Temporal patterns in insulin needs for Type 1 diabetes (Poster)  link » Type 1 Diabetes (T1D) is a chronic condition where the body produces little or no insulin, a hormone required for the cells to use blood glucose (BG) for energy and to regulate BG levels in the body. Finding the right insulin dose and time remains a complex, challenging and as yet unsolved control task. In this study, we use the OpenAPS Data Commons dataset, which is an extensive open-source dataset collected in real-life conditions, to discover temporal patterns in insulin need that include well-known factors such as carbohydrates as well as novel factors. We use two time series techniques, matrix profile and multivariate clustering, to find such patterns. The better we understand T1D and the factors impacting insulin needs, the more we can contribute to building data-driven technology for T1D treatments. Link » Isabella Degen · Zahraa Abdallah 🔗 Fri 9:30 a.m. - 10:30 a.m. Fair Multimodal Checklists for Interpretable Clinical Time Series Prediction (Poster)  link » Checklists are interpretable and easy-to-deploy models often used in real-world clinical decision-making. Prior work has demonstrated that checklists can be learned from binary input features in a data-driven manner by formulating the training objective as an integer programming problem. In this work, we learn diagnostic checklists for the task of phenotype classification with time series vitals data of ICU patients from the MIMIC-IV dataset. For 13 clinical phenotypes, we fully explore the empirical behavior of the checklist model with regard to multimodality, time series dynamics, and fairness. Our results show that the addition of the imaging data modality and the addition of shapelets that capture time series dynamics can significantly improve predictive performance. 
Checklist models optimized with explicit fairness constraints achieve the target fairness performance, at the expense of lower predictive performance. Link » Qixuan Jin · Haoran Zhang · Thomas Hartvigsen · Marzyeh Ghassemi 🔗 Fri 9:30 a.m. - 10:30 a.m. Deep Fitness Inference for Drug Discovery with Directed Evolution (Poster)  link » Directed evolution, with iterated mutation and human-designed selection, is a powerful approach for drug discovery. Here, we establish a fitness inference problem given on-target and off-target time series DNA sequencing data. We describe maximum likelihood solutions for the nonlinear dynamical system induced by fitness-based competition. Our approach learns from multiple time series rounds in a principled manner, in contrast to prior work focused on two-round enrichment prediction. While fitness inference does not require deep learning in principle, we show that inferring fitness while jointly learning a sequence-to-fitness transformer (DeepFitness) improves performance over a non-deep baseline, and a two-round enrichment baseline. Finally, we highlight how DeepFitness can improve the diversity of the discovered hits in a directed evolution experiment. Link » Nathaniel Diamant · Ziqing Lu · Christina Helmling · Kangway Chuang · Christian Cunningham · Tommaso Biancalani · Gabriele Scalia · Max Shen 🔗 Fri 9:30 a.m. - 10:30 a.m. On the Importance of Clinical Notes in Multi-modal Learning for EHR Data (Poster)  link » Understanding deep learning model behavior is critical to accepting machine learning-based decision support systems in the medical community. Previous works have shown that jointly using clinical notes with electronic health record (EHR) data improved predictive performance for patient monitoring in the intensive care unit (ICU). In this work, we explore the underlying reasons for these improvements. 
While relying on a basic attention-based model to allow for interpretability, we first confirm that performance significantly improves over state-of-the-art EHR data models when combining EHR data and clinical notes. We then provide an analysis showing improvements arise almost exclusively from a subset of notes containing broader context on patient state rather than clinician notes. We believe such findings show that deep learning models for EHR data are more limited by partially descriptive data than by modeling choice, motivating a more data-centric approach in the field. Link » Severin Husmann · Hugo Yèche · Gunnar Rätsch · Rita Kuznetsova 🔗 Fri 9:30 a.m. - 10:30 a.m. Personalized Dose Guidance using Safe Bayesian Optimization (Poster)  link » This work considers the problem of personalized dose guidance using Bayesian optimization that learns the optimum drug dose tailored to each individual, thus improving therapeutic outcomes. Safe learning using an interior point method guarantees safety with high probability. This is demonstrated using the problem of learning the optimum bolus insulin dose in patients with type 1 diabetes to counteract the effect of meal consumption. Starting from no a priori information about the patients, our dose guidance algorithm is able to improve the therapeutic outcome (measured in terms of % time-in-range) without jeopardizing patient safety. Other potential healthcare applications are also discussed. Link » 🔗 Fri 9:30 a.m. - 10:30 a.m. Dissecting In-the-Wild Stress from Multimodal Sensor Data (Poster)  link » Stress is associated with numerous chronic health conditions (both physical and mental). However, the effect of stress on individuals is understudied, leaving crucial questions unanswered. In particular, how variable is stress within and among individuals? 
In this work, we unveil preliminary findings from a major data collection effort from Digital Health Technologies (DHTs, such as smart rings and smartphones) and provide insights into stress in-the-wild. We use causal discovery to learn robust representations of stress in this population. Our findings reveal high levels of inter- and intra-individual heterogeneity in stress. This study is an important first step in better understanding potential underlying processes reflective of stress in individuals. Link » 🔗 Fri 10:30 a.m. - 11:30 a.m. Mentorship Lunch Break (Break) 🔗 Fri 11:30 a.m. - 12:00 p.m. Invited speaker - David Sontag, MIT (Talk) 🔗 Fri 12:00 p.m. - 12:30 p.m. Invited speaker - Emily Fox, Stanford (Talk) 🔗 Fri 12:30 p.m. - 1:00 p.m. Spotlight talks for poster session II (Spotlight) 🔗 Fri 12:30 p.m. - 1:00 p.m. Prediction-Constrained Markov Models for Medical Time Series with Missing Data and Few Labels (Spotlight)  link » When predicting outcomes for hospitalized patients, two key challenges are that the time series features are frequently missing and that supervisory labels may be available for only some sequences. While recent work has offered deep learning solutions, we consider a far simpler approach using the Hidden Markov model (HMM). Our probabilistic approach handles missing features via exact marginalization rather than imputation, thereby avoiding predictions that depend on specific guesses of missing values that do not account for uncertainty. To add effective supervision, we show that a prediction-constrained (PC) training objective can deliver high-quality predictions as well as interpretable generative models. When predicting mortality risk on two large health records datasets, our PC-HMM's precision-recall performance is equal to or better than that of the common GRU-D, even with 100x fewer parameters. 
Furthermore, when only a small fraction of sequences have labels, our PC-HMM approach can beat time-series adaptations of MixMatch, FixMatch, and other state-of-the-art methods for semi-supervised deep learning. Link » Preetish Rath · Gabe Hope · Kyle Heuton · Erik Sudderth · Michael Hughes 🔗 Fri 12:30 p.m. - 1:00 p.m. Time-constrained decision making in deceased donor kidney allocation (Spotlight)  link »    Deceased donor kidney allocation is a challenging sequential decision making problem constrained by the limited time that the kidney is medically viable. The decision made at each time point is a tradeoff between preserving equity (i.e., to offer the kidney to the next person on the waiting list) and seeking efficiency (i.e., to expedite to a more accepting patient lower down on the waiting list to avoid discard). Under the current allocation system, organ procurement organizations (OPOs) make ad-hoc decisions on when to prioritize efficiency over equity, leading to uneven treatment for patients skipped on the waitlist. We develop models to predict whether a donor will be hard-to-place based on the initial medical context of this sequential decision process, achieving a balanced accuracy of 80.2%. We improve balanced accuracy to 94.0% by adjusting predictions based on the sequentially updated medical contexts, that is, information accumulated during a kidney's match run. Our model can inform OPOs on whether to expedite a kidney based on their current context. We discuss associated implementation challenges, including those related to equity. Link » Nikhil Agarwal · Itai Ashlagi · Grace Guan · Paulo Somaini · Jiacheng Zou 🔗 Fri 12:30 p.m. - 1:00 p.m. Unsupervised Deep Metric Learning for the inference of hemodynamic value with Electrocardiogram signals (Spotlight)  link »    An objective assessment of intrathoracic pressures remains an important diagnostic method for patients with heart failure. 
Although cardiac catheterization is the gold standard for estimating central hemodynamic pressures, it is an invasive procedure where a pressure transducer is inserted into a great vessel and threaded into the right heart chambers. Approaches that leverage non-invasive signals, such as the electrocardiogram (ECG), have the promise to make the routine estimation of cardiac pressures feasible in both inpatient and outpatient settings. Prior models that were trained in a supervised fashion to estimate central pressures have shown good discriminatory ability over a heterogeneous cohort when the number of training examples is large. As obtaining central pressures (the labels) requires an invasive procedure that can only be performed in an inpatient setting, acquiring large labeled datasets for different patient cohorts is challenging. In this work, we leverage a dataset that contains over 5.4 million ECGs, without concomitant central pressure labels, to improve the performance of models trained with sparsely labeled datasets. Using a deep metric learning (DML) objective function, we develop a procedure for building latent 12-lead ECG representations and demonstrate that these latent representations can be used to improve the discriminatory performance of a model trained in a supervised fashion on a smaller labeled dataset. More generally, our results show that training with DML objectives on both labeled and unlabeled ECGs yields downstream performance on par with the supervised baseline. Link » Hyewon Jeong · Marzyeh Ghassemi · Collin Stultz 🔗 Fri 12:30 p.m. - 1:00 p.m. Wearable-based Human Activity Recognition with Spatio-Temporal Spiking Neural Networks (Spotlight)  link » We study the Human Activity Recognition (HAR) task, which predicts user daily activity based on time series data from wearable sensors. Recently, researchers have used end-to-end Artificial Neural Networks (ANNs) to extract the features and perform the classification in HAR. 
However, ANNs impose heavy computational burdens on wearable devices and lack temporal feature extraction. In this work, we apply Spiking Neural Networks (SNNs), an architecture inspired by biological neurons, to HAR tasks. SNNs allow spatio-temporal extraction of features and enjoy low-power computation through binary spikes. We conduct extensive experiments on three HAR datasets with SNNs, demonstrating that SNNs are on par with ANNs in terms of accuracy while reducing energy consumption by up to 94%. Link » Yuhang Li · Ruokai Yin · Hyoungseob Park · Youngeun Kim · Priyadarshini Panda 🔗 Fri 12:30 p.m. - 1:00 p.m. Sleep and Activity Prediction for Type 2 Diabetes Management using Continuous Glucose Monitoring (Spotlight)  link » Continuous glucose monitors (CGMs) generate frequent glucose measurements, and numerous studies suggest that these devices may improve diabetes management. These devices support behavior change and self-management by giving people with diabetes real-time visibility into how behavioral and lifestyle factors, i.e., meals, physical activity, sleep, stress, and medication adherence, drive their glucose levels. While earlier studies have shown that individuals' actions can influence their CGM data, it has not been clear whether CGM data can provide information about these actions. This is the first study to show on a large cohort that CGM can provide information about sleep and physical activities. We first train a neural network model to determine the sequence of daily activities from CGM signals, and then extend the model to use additional data, such as individual demographics and medical claims history. Using data from 6981 participants in a Type 2 diabetes (T2D) management program, we show that a model combining an individual's CGM, demographics, and claims data is highly predictive of sleep (AUROC 0.947), and moderately predictive of a range of physical activities (AUROCs of 0.722-0.817). 
These results show that CGM may have wider utility as a tool for behavior change than previously known. Link » Kimmo Karkkainen · Gregory Lyng · Brian Hill · Kailas Vodrahalli · Jeffrey Hertzberg · Eran Halperin 🔗 Fri 12:30 p.m. - 1:00 p.m. Contrastive Pre-Training for Multimodal Medical Time Series (Spotlight)  link » Clinical time series data are rich and provide significant information about a patient's physiological state. However, these time series can be complex to model, particularly when they consist of multimodal data measured at different resolutions. Most existing methods to learn representations of these data consider only tabular time series (e.g., lab measurements and vital signs), and do not naturally extend to modelling a full, multimodal time series. In this work, we propose a contrastive pre-training strategy to learn representations of multimodal time series. We consider a setting where the time series contains sequences of (1) high-frequency electrocardiograms and (2) structured data from labs and vitals. We outline a strategy to generate augmentations of these data for contrastive learning, building on recent work in representation learning for medical data. We evaluate our method on a real-world dataset, finding it obtains improved or competitive performance when compared to baselines on two downstream tasks. Link » Aniruddh Raghu · Payal Chandak · Ridwan Alam · John Guttag · Collin Stultz 🔗 Fri 12:30 p.m. - 1:00 p.m. Dynamic Survival Transformers for Causal Inference with Electronic Health Records (Spotlight)  link » In medicine, researchers often seek to infer the effects of a given treatment on patients' outcomes, such as the expected time until infection. However, the standard methods for causal survival analysis make simplistic assumptions about the data-generating process and cannot capture complex interactions among patient covariates. 
We introduce the Dynamic Survival Transformer (DynST), a deep survival model that trains on electronic health records (EHRs). Unlike previous transformers used in survival analysis, DynST can make use of time-varying information to predict evolving survival probabilities. We derive a semi-synthetic EHR dataset from MIMIC-III to show that DynST can accurately estimate the causal effect of a treatment intervention on restricted mean survival time (RMST). We demonstrate that DynST achieves better predictive and causal estimation than two alternative models. Link » Prayag Chatha · Yixin Wang · Zhenke Wu · Jeffrey Regier 🔗 Fri 12:30 p.m. - 1:00 p.m. sEHR-CE: Language modelling of structured EHR data for efficient and generalizable patient cohort expansion (Spotlight)  link »    Electronic health records (EHR) offer unprecedented opportunities for in-depth clinical phenotyping and prediction of clinical outcomes. Combining multiple data sources is crucial to generate a complete picture of disease prevalence, incidence and trajectories. The standard approach to combining clinical data involves collating clinical terms across different terminology systems using curated maps, which are often inaccurate and/or incomplete. Here, we propose sEHR-CE, a novel framework based on transformers to enable integrated phenotyping and analyses of heterogeneous clinical datasets without relying on these mappings. We unify clinical terminologies using textual descriptors of concepts, and represent individuals’ EHR as sections of text. We then fine-tune pre-trained language models to predict disease phenotypes more accurately than non-text and single terminology approaches. We validate our approach using primary and secondary care data from the UK Biobank, a large-scale research study. Finally, we illustrate in a type 2 diabetes use case how sEHR-CE identifies individuals without diagnosis that share clinical characteristics with patients. 
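The text representation described in the sEHR-CE abstract (replacing codes from different terminologies with their textual descriptors, then treating a patient's record as text) can be illustrated with a minimal sketch. This is not the authors' pipeline: the `DESCRIPTORS` table, the `LOCAL` terminology and its codes, and the `record_to_text` helper are hypothetical names introduced here for illustration, and the language-model fine-tuning step is not shown.

```python
from typing import Dict, List, Tuple

Code = Tuple[str, str]  # (terminology name, code)

# Toy descriptor table, invented for illustration. The ICD-10 labels are the
# standard ones; the "LOCAL" terminology and its codes are hypothetical.
DESCRIPTORS: Dict[Code, str] = {
    ("ICD10", "E11"): "Type 2 diabetes mellitus",
    ("ICD10", "I10"): "Essential (primary) hypertension",
    ("LOCAL", "X001"): "type 2 diabetes mellitus",
    ("LOCAL", "X002"): "hypertension monitoring",
}


def record_to_text(events: List[Code], sep: str = ". ") -> str:
    """Turn a chronologically ordered list of coded events into one string.

    Each code is replaced by its lower-cased textual descriptor, so codes
    from different terminologies that describe the same concept end up as
    the same text without a curated code-to-code map. Codes with no known
    descriptor are skipped rather than guessed.
    """
    phrases = []
    for event in events:
        descriptor = DESCRIPTORS.get(event)
        if descriptor is not None:
            phrases.append(descriptor.strip().lower())
    return sep.join(phrases)


if __name__ == "__main__":
    patient = [("ICD10", "I10"), ("LOCAL", "X001"), ("LOCAL", "UNKNOWN")]
    print(record_to_text(patient))
```

In this sketch the two diabetes codes map to identical text, which is the property that would let a pre-trained language model consume records from both sources; how closely real descriptors align across terminologies is an assumption that would need checking on actual data.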
Link » Anna Munoz-Farre · Harry Rose · Aylin Cakiroglu 🔗 Fri 12:30 p.m. - 1:00 p.m. Modeling MRSA decolonization: Interactions between body sites and the impact of site-specific clearance (Spotlight)  link » MRSA colonization is a critical public health concern. Decolonization protocols have been designed for the clearance of MRSA. Successful decolonization protocols reduce disease incidence; however, multiple protocols exist, comprising diverse therapies targeting multiple body sites, and the optimal protocol is unclear. Here, we formulate a machine learning model using data from a randomized controlled trial (RCT) of MRSA decolonization, which estimates interactions between body sites, quantifies the contribution of each therapy to successful decolonization, and enables predictions of the efficacy of therapy combinations. This work shows how a machine learning model can help design and improve complex clinical protocols. Link » 🔗 Fri 1:00 p.m. - 2:00 p.m. Poster Session II - Coffee break (Poster Session) 🔗 Fri 1:00 p.m. - 2:00 p.m. Continuous Time Evidential Distributions for Processing Irregular Time Series (Poster)  link » The proper handling of irregular time series is a significant challenge when formulating predictions from health data. It is difficult to infer the value of any one feature at a given time when observations are sporadic, as a missing feature could take on a large range of values depending on when it was last observed. To characterize this uncertainty directly, we propose a strategy that learns an evidential distribution over irregular time series in continuous time. We demonstrate that this method provides stable, temporally correlated predictions and corresponding uncertainty estimates based on the evidence gained with each collected observation. 
The continuous time evidential distribution enables flexible inference of the evolution of the partially observed features at any time of interest, while expanding uncertainty temporally for sparse, irregular observations. We envision that this inference process may support robust sequential decision making processes in clinical settings such as feature acquisition or treatment effect estimation. Link » Taylor Killian · Ava Soleimany 🔗 Fri 1:00 p.m. - 2:00 p.m. Prediction-Constrained Markov Models for Medical Time Series with Missing Data and Few Labels (Poster)  link » When predicting outcomes for hospitalized patients, two key challenges are that the time series features are frequently missing and that supervisory labels may be available for only some sequences. While recent work has offered deep learning solutions, we consider a far simpler approach using the Hidden Markov model (HMM). Our probabilistic approach handles missing features via exact marginalization rather than imputation, thereby avoiding predictions that depend on specific guesses of missing values that do not account for uncertainty. To add effective supervision, we show that a prediction-constrained (PC) training objective can deliver high-quality predictions as well as interpretable generative models. When predicting mortality risk on two large health records datasets, our PC-HMM's precision-recall performance is equal to or better than that of the common GRU-D, even with 100x fewer parameters. Furthermore, when only a small fraction of sequences have labels, our PC-HMM approach can beat time-series adaptations of MixMatch, FixMatch, and other state-of-the-art methods for semi-supervised deep learning. Link » Preetish Rath · Gabe Hope · Kyle Heuton · Erik Sudderth · Michael Hughes 🔗 Fri 1:00 p.m. - 2:00 p.m. 
Time-constrained decision making in deceased donor kidney allocation (Poster)  link » Deceased donor kidney allocation is a challenging sequential decision making problem constrained by the limited time that the kidney is medically viable. The decision made at each time point is a tradeoff between preserving equity (i.e., to offer the kidney to the next person on the waiting list) and seeking efficiency (i.e., to expedite to a more accepting patient lower down on the waiting list to avoid discard). Under the current allocation system, organ procurement organizations (OPOs) make ad-hoc decisions on when to prioritize efficiency over equity, leading to uneven treatment for patients skipped on the waitlist. We develop models to predict whether a donor will be hard-to-place based on the initial medical context of this sequential decision process, achieving a balanced accuracy of 80.2%. We improve balanced accuracy to 94.0% by adjusting predictions based on the sequentially updated medical contexts, that is, information accumulated during a kidney's match run. Our model can inform OPOs on whether to expedite a kidney based on their current context. We discuss associated implementation challenges, including those related to equity. Link » Nikhil Agarwal · Itai Ashlagi · Grace Guan · Paulo Somaini · Jiacheng Zou 🔗 Fri 1:00 p.m. - 2:00 p.m. An SNN Based ECG Classifier For Wearable Edge Devices (Poster)  link » In situ real-time monitoring of ECG signals on wearables and implantables such as smart watches, implantable loop recorders (ILRs), and pacemakers is crucial for early clinical intervention in cardiovascular diseases. Existing deep-learning-based techniques are not suitable to run on such low-power, low-memory, battery-driven devices. In this paper, we have designed and implemented a reservoir-based SNN and a feed-forward SNN, and compared their performance for ECG pattern classification along with a new peak-based spike encoder and two other spike encoders. 
The feed-forward SNN coupled with the peak-based encoder is observed to deliver the best performance with the least computational effort and thus minimal power consumption. Therefore, this SNN-based system running on Neuromorphic Computing (NC) platforms can be a suitable solution for ECG pattern classification at the wearable edge. Link » Dighanchal Banerjee · Sounak Dey · Arpan Pal 🔗 Fri 1:00 p.m. - 2:00 p.m. DeepJoint: Robust Survival Modelling Under Clinical Presence Shift (Poster)  link » Medical data arise from the complex interaction between patients and healthcare systems. This data-generating process often constitutes an informative process. Prediction models often ignore this process, potentially hampering performance and transportability when this interaction evolves. This work explores how current practices may suffer from shifts in this clinical presence process and proposes a multi-task recurrent neural network to tackle this issue. The proposed joint modelling performs similarly to state-of-the-art predictive models on a real-world prediction task. More importantly, the approach appears more robust to change in the clinical presence setting. This analysis emphasises the importance of modelling clinical presence to improve performance and transportability. Link » Vincent Jeanselme · Glen Martin · Niels Peek · Matthew Sperrin · Brian Tom · Jessica Barrett 🔗 Fri 1:00 p.m. - 2:00 p.m. Are you asleep when your phone is asleep? Semi-supervised methods to infer sleep from smart devices (Poster)  link » Sleep is a vital aspect of our life. Good-quality sleep is necessary for our well-being and health. Therefore, sleep measurements can aid us in improving our sleep quality. While many users are reluctant to use intrusive sleep sensing techniques such as wearables, passive sensing such as network activity of smart phone devices can be utilized to measure the sleep duration of a user. 
However, to develop accurate sleep prediction models, we need large amounts of labeled data. In addition, due to heterogeneity in user behaviors and in the hardware and software of the devices used, a single model may not generalise to every user in a given population. Although ground truth data collection from a large population is costly and challenging, unlabelled network activity data is easy to gather using mobile applications or network logs. This motivates us to look for semi-supervised learning approaches to leverage unlabelled data from the users to develop accurate sleep prediction models. Our results show that semi-supervised learning techniques can be used to improve the accuracy of sleep duration estimation from smart devices. Link » Priyanka Mary Mammen · Prashant Shenoy 🔗 Fri 1:00 p.m. - 2:00 p.m. Unsupervised Deep Metric Learning for the inference of hemodynamic value with Electrocardiogram signals (Poster)  link » An objective assessment of intrathoracic pressures remains an important diagnostic method for patients with heart failure. Although cardiac catheterization is the gold standard for estimating central hemodynamic pressures, it is an invasive procedure where a pressure transducer is inserted into a great vessel and threaded into the right heart chambers. Approaches that leverage non-invasive signals, such as the electrocardiogram (ECG), have the promise to make the routine estimation of cardiac pressures feasible in both inpatient and outpatient settings. Prior models that were trained in a supervised fashion to estimate central pressures have shown good discriminatory ability over a heterogeneous cohort when the number of training examples is large. As obtaining central pressures (the labels) requires an invasive procedure that can only be performed in an inpatient setting, acquiring large labeled datasets for different patient cohorts is challenging. 
In this work, we leverage a dataset that contains over 5.4 million ECGs, without concomitant central pressure labels, to improve the performance of models trained with sparsely labeled datasets. Using a deep metric learning (DML) objective function, we develop a procedure for building latent 12-lead ECG representations and demonstrate that these latent representations can be used to improve the discriminatory performance of a model trained in a supervised fashion on a smaller labeled dataset. More generally, our results show that training with DML objectives on both labeled and unlabeled ECGs yields downstream performance on par with the supervised baseline. Link » Hyewon Jeong · Marzyeh Ghassemi · Collin Stultz 🔗 Fri 1:00 p.m. - 2:00 p.m. Multi-modal 3D Human Pose Estimation using mmWave, RGB-D, and Inertial Sensors (Poster)  link » The ability to estimate 3D human body pose and movement, also known as human pose estimation (HPE), enables many applications for home-based health monitoring, such as remote rehabilitation training. Several possible solutions have emerged using sensors ranging from RGB cameras and depth sensors to millimeter-wave (mmWave) radars and wearable inertial sensors. Despite previous efforts on datasets and benchmarks for HPE, few datasets exploit multiple modalities and focus on home-based health monitoring. To bridge this gap, we present human pose estimation using multiple modalities with an in-house dataset. We perform extensive experiments and delineate the strength of each modality. Link » Sizhe An · Yin Li · Umit Ogras 🔗 Fri 1:00 p.m. - 2:00 p.m. Wearable-based Human Activity Recognition with Spatio-Temporal Spiking Neural Networks (Poster)  link » We study the Human Activity Recognition (HAR) task, which predicts user daily activity based on time series data from wearable sensors. Recently, researchers have used end-to-end Artificial Neural Networks (ANNs) to extract the features and perform the classification in HAR. 
However, ANNs impose a heavy computational burden on wearable devices and lack temporal feature extraction. In this work, we apply Spiking Neural Networks (SNNs), an architecture inspired by biological neurons, to HAR tasks. SNNs allow spatio-temporal extraction of features and enable low-power computation through binary spikes. We conduct extensive experiments on three HAR datasets with SNNs, demonstrating that SNNs are on par with ANNs in terms of accuracy while reducing energy consumption by up to 94%. Link » Yuhang Li · Ruokai Yin · Hyoungseob Park · Youngeun Kim · Priyadarshini Panda 🔗 Fri 1:00 p.m. - 2:00 p.m. Automatic Sleep Scoring from Large-scale Multi-channel Pediatric EEG (Poster)  link » Sleep is particularly important to the health of infants, children, and adolescents, and sleep scoring is the first step to accurate diagnosis and treatment of potentially life-threatening conditions. But pediatric sleep is severely under-researched compared to adult sleep in the context of machine learning for health, and sleep scoring algorithms developed for adults usually perform poorly on infants. Here, we present the first automated sleep scoring results on a recent large-scale pediatric sleep study dataset that was collected during standard clinical care. We develop a transformer-based supervised learning model that learns to classify five sleep stages from millions of multi-channel electroencephalogram (EEG) sleep epochs with 78% overall accuracy. Further, we conduct an in-depth analysis of the model performance based on patient demographics and EEG channels. The results point to the growing need for machine learning research on pediatric sleep. Link » Harlin Lee · Aaqib Saeed 🔗 Fri 1:00 p.m. - 2:00 p.m. SurviVAEl: Variational Autoencoders for Clustering Time Series (Poster)  link » Multi-state models are generalizations of time-to-event models, where individuals progress through discrete states in continuous time. 
As opposed to classical approaches to survival analysis which include only alive-dead transitions, states can be competing in nature and transient, enabling richer modelling of complex clinical event series. Classical multi-state models, such as the Cox-Markov model, struggle to capture idiosyncratic, non-linear, time-dependent, or high-dimensional covariates for which more sophisticated machine learning models are needed. Recently proposed extensions can overcome these limitations; however, they do not allow for uncertainty quantification of the model prediction, and typically have limited interpretability at the individual or population level. Here, we introduce SurviVAEl, a multi-state survival framework based on a VAE architecture, enabling uncertainty quantification and interpretable patient trajectory clustering. Link » Stefan Groha · Alexander Gusev · Sebastian Schmon 🔗 Fri 1:00 p.m. - 2:00 p.m. Generalizable Semi-supervised Learning Strategies for Multiple Learning Tasks using 1-D Biomedical Signals (Poster)  link » Progress in the sensors field has enabled collection of biomedical signal data, such as photoplethysmography (PPG), electrocardiogram (ECG), and electroencephalogram (EEG), allowing for application of supervised machine learning techniques such as convolutional neural networks (CNN). However, the cost associated with annotating these biomedical signals is high and prevents the widespread use of such techniques. To address the challenges of generating a large labeled dataset, we adapt and apply semi-supervised learning (SSL) frameworks to a new problem setting, i.e., artifact detection in PPG signals, and verify their generalizability to ECG and EEG as well. Our proposed framework is able to leverage unlabeled data to achieve PPG artifact detection performance similar to that of a fully supervised learning approach using only 75 labeled samples, or 0.5% of the available labeled data. 
Link » Luca Cerny Oliveira · Zhengfeng Lai · Heather Siefkes · Chen-Nee Chuah 🔗 Fri 1:00 p.m. - 2:00 p.m. MAEEG: Masked Auto-encoder for EEG Representation Learning (Poster)  link » Decoding information from bio-signals such as EEG using machine learning has been a challenge due to small datasets and the difficulty of obtaining labels. We propose a reconstruction-based self-supervised learning model, the masked auto-encoder for EEG (MAEEG), for learning EEG representations by learning to reconstruct the masked EEG features using a transformer architecture. We found that MAEEG can learn representations that significantly improve sleep stage classification (~5% accuracy increase) when only a small number of labels are given. We also found that input sample lengths and different ways of masking during reconstruction-based SSL pretraining have a huge effect on downstream model performance. Specifically, learning to reconstruct a larger and more concentrated masked portion of the signal results in better performance on sleep classification. Our findings provide insight into how reconstruction-based SSL could help representation learning for EEG. Link » Sherry Chien · Hanlin Goh · Christopher Sandino · Joseph Cheng 🔗 Fri 1:00 p.m. - 2:00 p.m. Sleep and Activity Prediction for Type 2 Diabetes Management using Continuous Glucose Monitoring (Poster)  link » Continuous glucose monitors (CGMs) generate frequent glucose measurements, and numerous studies suggest that these devices may improve diabetes management. These devices support behavior change and self-management by giving people with diabetes real-time visibility into how behavioral and lifestyle factors, i.e., meals, physical activity, sleep, stress, and medication adherence, drive their glucose levels. While earlier studies have shown that an individual's actions can influence their CGM data, it has not been clear whether CGM data can provide information about these actions. 
This is the first study to show on a large cohort that CGM can provide information about sleep and physical activities. We first train a neural network model to determine the sequence of daily activities from CGM signals, and then extend the model to use additional data, such as individual demographics and medical claims history. Using data from 6981 participants in a Type 2 diabetes (T2D) management program, we show that a model combining an individual's CGM, demographics, and claims data is highly predictive of sleep (AUROC 0.947), and moderately predictive of a range of physical activities (AUROCs of 0.722-0.817). These results show that CGM may have wider utility as a tool for behavior change than previously known. Link » Kimmo Karkkainen · Gregory Lyng · Brian Hill · Kailas Vodrahalli · Jeffrey Hertzberg · Eran Halperin 🔗 Fri 1:00 p.m. - 2:00 p.m. Inferring mood disorder symptoms from multivariate time-series sensory data (Poster)  link » Mood disorders are increasingly recognized among the leading causes of disease burden worldwide. Depressive and manic episodes in mood disorders commonly involve altered mood, sleep, and motor activity. These translate to changes in sensory data that wearable devices can continuously and affordably monitor, thereby positioning themselves as promising candidates to model mood disorders. Previous similar endeavors cast this problem in terms of binary classification (cases vs controls) or regress the total score of some commonly used psychometric scale. Nevertheless, these approaches fail to capture the variability within symptom domains described at the item level in psychometric scales. In this work, we attempt to infer mood disorder symptoms (e.g., depressed mood, insomnia, irritability) from time-series data collected with medical-grade Empatica E4 wristbands, as part of an exploratory, observational, and longitudinal study. 
We propose a multi-label framework to predict individual items from the two most widely used scales for assessing depression and mania. We experiment with two different approaches to preprocess the high-dimensional and noisy sensory data and attain results within a clinically acceptable level of error. Link » Bryan Li · Filippo Corponi · Gerard Anmella · Ariadna Mas Musons · Miriam Sanabra · Diego Hidalgo-Mazzei · Antonio Vergari 🔗 Fri 1:00 p.m. - 2:00 p.m. Contrastive Learning of Electrodermal Activity Representations for Stress Detection (Poster)  link » Electrodermal activity (EDA), usually measured as skin conductance, is a biosignal that contains valuable information for health monitoring. However, building machine learning models utilizing EDA data is challenging because EDA measurements tend to be noisy and sparsely labelled. To address this problem, we investigate applying contrastive learning to EDA. The EDA signal presents different challenges than the domains to which contrastive learning is usually applied (e.g., text and images). In particular, EDA is non-stationary and subject to specific kinds of noise. In this study, we focus on designing contrastive learning methods that are tailored to EDA data. We propose novel transformations of EDA signals to produce sets of positive examples within a contrastive learning framework. We evaluate our proposed approach on the downstream task of stress detection. We find that the embeddings learned with our contrastive pre-training approach outperform baselines, including fully supervised methods. Link » Katie Matton · Robert Lewis · John Guttag · Rosalind Picard 🔗 Fri 1:00 p.m. - 2:00 p.m. Contrastive Pre-Training for Multimodal Medical Time Series (Poster)  link » Clinical time series data are highly rich and provide significant information about a patient's physiological state. However, these time series can be complex to model, particularly when they consist of multimodal data measured at different resolutions. 
Most existing methods to learn representations of these data consider only tabular time series (e.g., lab measurements and vital signs), and do not naturally extend to modelling a full, multimodal time series. In this work, we propose a contrastive pre-training strategy to learn representations of multimodal time series. We consider a setting where the time series contains sequences of (1) high-frequency electrocardiograms and (2) structured data from labs and vitals. We outline a strategy to generate augmentations of these data for contrastive learning, building on recent work in representation learning for medical data. We evaluate our method on a real-world dataset, finding it obtains improved or competitive performance when compared to baselines on two downstream tasks. Link » Aniruddh Raghu · Payal Chandak · Ridwan Alam · John Guttag · Collin Stultz 🔗 Fri 1:00 p.m. - 2:00 p.m. Identifying Structure in the MIMIC ICU Dataset (Poster)  link » The MIMIC-III dataset, containing trajectories of 40,000 ICU patients, is one of the most popular datasets in the machine learning for health space. However, there has been very little systematic exploration of the natural structure of these data; most analyses enforce some type of top-down clustering or embedding. We take a bottom-up approach, identifying consistent structures that are robust across a range of embedding choices. We identified two dominant structures sorted by either fraction of inspired oxygen or creatinine, both of which were validated as the key features by our clinical co-author. Our bottom-up approach to studying the macro-structure of a dataset can also be adapted for other datasets. Link » Qi Qi Chin 🔗 Fri 1:00 p.m. - 2:00 p.m. Improving Counterfactual Explanations for Time Series Classification Models in Healthcare Settings (Poster)  link » Explanations of machine learning models' decisions can help build trust as well as identify and isolate unexpected model behavior. 
Time series data, abundant in medical applications, and their associated classifiers pose a particularly difficult explainability problem due to the inherent feature dependency that results in complex modeling decisions and assumptions. Counterfactual explanations for a given time series tell the user how the input to the model needs to change in order to receive a different class prediction from the classifier. While a few methods for generating counterfactual explanations for time series have been proposed, the needs of simplicity and plausibility have been overlooked. In this paper, we propose an easily understood method to generate realistic counterfactual explanations for any black box time series model. Our method, Shapelet-Guided Realistic Counterfactual Explanation Generation for Black-Box Time Series Classifiers (SGRCEG), grounds the search for counterfactual explanations in shapelets, which are discriminatory subsequences in time series. SGRCEG greedily constructs counterfactual explanations based on shapelets. Additionally, SGRCEG employs a realism check, so the likelihood of producing a counterfactual that is not plausible is minimized. Using SGRCEG, model developers as well as medical practitioners can better understand the decisions of their models. Link » Tina Han · Jette Henderson · Pedram Akbarian Saravi · Joydeep Ghosh 🔗 Fri 1:00 p.m. - 2:00 p.m. Contactless Oxygen Monitoring with Gated Transformer (Poster)  link » With the increasing popularity of telehealth, it becomes critical to ensure that basic physiological signals can be monitored accurately at home, with minimal patient overhead. In this paper, we propose a contactless approach for monitoring patients' blood oxygen at home, simply by analyzing the radio signals in the room, without any wearable devices. 
We extract the patients' respiration from the radio signals that bounce off their bodies and devise a novel neural network that infers a patient's oxygen estimates from their breathing signal. Our model, called Gated BERT-UNet, is designed to adapt to the patient's medical indices (e.g., gender, sleep stages). It has multiple predictive heads and selects the most suitable head via a gate controlled by the person's physiological indices. Extensive empirical results show that our model achieves high accuracy on both medical and radio datasets. Link » Hao He · Yuan Yuan · Yingcong Chen · Peng Cao · Dina Katabi 🔗 Fri 1:00 p.m. - 2:00 p.m. Dynamic Survival Transformers for Causal Inference with Electronic Health Records (Poster)  link » In medicine, researchers often seek to infer the effects of a given treatment on patients' outcomes, such as the expected time until infection. However, the standard methods for causal survival analysis make simplistic assumptions about the data-generating process and cannot capture complex interactions among patient covariates. We introduce the Dynamic Survival Transformer (DynST), a deep survival model that trains on electronic health records (EHRs). Unlike previous transformers used in survival analysis, DynST can make use of time-varying information to predict evolving survival probabilities. We derive a semi-synthetic EHR dataset from MIMIC-III to show that DynST can accurately estimate the causal effect of a treatment intervention on restricted mean survival time (RMST). We demonstrate that DynST achieves better predictive and causal estimation than two alternative models. Link » Prayag Chatha · Yixin Wang · Zhenke Wu · Jeffrey Regier 🔗 Fri 1:00 p.m. - 2:00 p.m. Modeling Heart Rate Response to Exercise with Wearables Data (Poster)  link » Heart rate (HR) dynamics in response to workout intensity measure key aspects of an individual's fitness and cardiorespiratory health. 
Models of exercise physiology have been used to characterize cardiorespiratory fitness in well-controlled laboratory settings, but face additional challenges when applied to wearables in noisy, real-world settings. Here, we introduce a hybrid machine learning model that combines a physiological model of HR during exercise with complex neural networks in order to learn user-specific fitness representations. We apply this model at scale to a large set of workout data collected with wearables and show that it can accurately predict HR response to exercise demand in new workouts. We further show that the learned embeddings correlate with traditional metrics of cardiorespiratory fitness. Lastly, we illustrate how our model naturally incorporates and learns the effects of environmental factors such as temperature and humidity. Link » Achille Nazaret · Sana Tonekaboni · Gregory Darnell · Shirley Ren · Guillermo Sapiro · Andrew Miller 🔗 Fri 1:00 p.m. - 2:00 p.m. Adversarial Masking for Pretraining ECG Data Improves Downstream Model Generalizability (Poster)  link » Medical datasets often face the problem of data scarcity, as ground truth labels must be generated by medical professionals. One mitigation strategy is to pretrain deep learning models on large, unlabelled datasets with self-supervised learning (SSL), but this introduces the issue of domain shift if the pretraining and task dataset distributions differ. Data augmentations are essential for improving the generalizability of SSL-pretrained models, but they tend to be either handcrafted or randomly applied. We use an adversarial model to generate masks as augmentations for 12-lead electrocardiogram (ECG) data, where masks learn to occlude diagnostically-relevant regions. Compared to random augmentations, adversarial masking reaches better accuracy on a downstream arrhythmia classification task under a domain shift condition and in data-scarce regimes. 
Adversarial masking is competitive with, and yields further improvements when combined with, the state-of-the-art ECG augmentation methods 3KG and random lead masking. Link » Jessica Yi Fei Bo · Hen-Wei Huang · Alvin Chan · Giovanni Traverso 🔗 Fri 1:00 p.m. - 2:00 p.m. Masked Autoencoder-Based Self-Supervised Learning for Electrocardiograms to Detect Left Ventricular Systolic Dysfunction. (Poster)  link » The generalization of deep neural network algorithms to a broader population is an important challenge in the medical field. In this study, we aimed to apply self-supervised learning using masked autoencoders (MAEs) to improve the performance of deep learning models that detect left ventricular systolic dysfunction (LVSD) from 12-lead electrocardiography data. In our MAE approach, we first mask the vast majority, that is, 75% of the ECG time series. Second, we pretrain a Vision Transformer encoder by inferring the masked part. Our proposed approach enables learning rich features that generalize well from unlabeled ECG data. In fact, the reconstructed ECG maintains the relationships among the major ECG components. Transfer performance in the detection of LVSD outperforms the baseline CNN model on external validation datasets and shows promising results for generalization that enables us to use the model for a broader population by solely using ECG data collected in a single medical institution. Link » Shinnosuke Sawano · Satoshi Kodera · Hirotoshi Takeuchi · Issei Sukeda · Susumu Katsushika · Issei Komuro 🔗 Fri 1:00 p.m. - 2:00 p.m. sEHR-CE: Language modelling of structured EHR data for efficient and generalizable patient cohort expansion (Poster)  link » Electronic health records (EHR) offer unprecedented opportunities for in-depth clinical phenotyping and prediction of clinical outcomes. Combining multiple data sources is crucial to generate a complete picture of disease prevalence, incidence and trajectories. 
The standard approach to combining clinical data involves collating clinical terms across different terminology systems using curated maps, which are often inaccurate and/or incomplete. Here, we propose sEHR-CE, a novel framework based on transformers to enable integrated phenotyping and analyses of heterogeneous clinical datasets without relying on these mappings. We unify clinical terminologies using textual descriptors of concepts, and represent individuals’ EHR as sections of text. We then fine-tune pre-trained language models to predict disease phenotypes more accurately than non-text and single terminology approaches. We validate our approach using primary and secondary care data from the UK Biobank, a large-scale research study. Finally, we illustrate in a type 2 diabetes use case how sEHR-CE identifies individuals without diagnosis that share clinical characteristics with patients. Link » Anna Munoz-Farre · Harry Rose · Aylin Cakiroglu 🔗 Fri 1:00 p.m. - 2:00 p.m. FastCPH: Efficient Survival Analysis for Neural Networks (Poster)  link » The Cox proportional hazards model is a canonical method in survival analysis for prediction of the life expectancy of a patient given clinical or genetic covariates -- it is a linear model in its original form. In recent years, several methods have been proposed to generalize the Cox model to neural networks, but none of these are both numerically correct and computationally efficient. We propose FastCPH, a new method that runs in linear time and supports both the standard Breslow and Efron methods for tied events. We also demonstrate the performance of FastCPH combined with LassoNet, a neural network that provides interpretability through feature sparsity, on survival datasets. The final procedure is efficient, selects useful covariates and outperforms existing CoxPH approaches. Link » Xuelin Yang · Louis F Abraham · Sejin Kim · Petr Smirnov · Feng Ruan · Benjamin Haibe-Kains · Robert Tibshirani 🔗 Fri 1:00 p.m. - 2:00 p.m. 
Modeling MRSA decolonization: Interactions between body sites and the impact of site-specific clearance (Poster)  link » MRSA colonization is a critical public health concern. Decolonization protocols have been designed for the clearance of MRSA. Successful decolonization protocols reduce disease incidence; however, multiple protocols exist, comprising diverse therapies targeting multiple body sites, and the optimal protocol is unclear. Here, we formulate a machine learning model using data from a randomized controlled trial (RCT) of MRSA decolonization, which estimates interactions between body sites, quantifies the contribution of each therapy to successful decolonization, and enables predictions of the efficacy of therapy combinations. This work shows how a machine learning model can help design and improve complex clinical protocols. Link » 🔗 Fri 1:00 p.m. - 2:00 p.m. A Temporal Fusion Transformer for Long-term Explainable Prediction of Emergency Department Overcrowding (Poster)  link » Emergency Departments (EDs) are a fundamental element of the Portuguese National Health Service, serving as an entry point for users with diverse and severe medical problems. Due to the inherent characteristics of the ED, forecasting the number of patients using the services is particularly challenging. A mismatch between patient inflow and the number of medical professionals can lead to a decrease in the quality of the services provided and create problems that have repercussions for the entire hospital, with the requisition of health care workers from other departments and the postponement of surgeries. ED overcrowding is driven, in part, by non-urgent patients who resort to emergency services despite not having a medical emergency, representing almost half of the total number of daily patients. 
This paper describes a novel deep learning architecture, the Temporal Fusion Transformer, that uses calendar and time-series covariates to forecast prediction intervals and point predictions for a 4-week period. We have concluded that patient volume can be forecasted with a Mean Absolute Percentage Error (MAPE) of 9.87% for Portugal’s Health Regional Areas (HRA) and a Root Mean Squared Error (RMSE) of 178 people/day. The paper shows empirical evidence supporting the use of a multivariate approach with static and time-series covariates while surpassing other models commonly found in the literature. Link » 🔗 Fri 2:00 p.m. - 3:00 p.m. Panel Discussion: Challenges and lessons learned in deploying ML time series models (Discussion Panel) 🔗 Fri 2:45 p.m. - 3:00 p.m. Closing Remarks (In-person remarks) 🔗