
Spotlight Talk
Workshop: Machine Learning for Mobile Health

A generative, predictive model for menstrual cycle lengths that accounts for potential self-tracking artifacts in mobile health data

Kathy Li


Mobile health (mHealth) apps such as menstrual trackers provide a rich source of self-tracked health observations that can be leveraged for statistical modeling. However, such data streams are notoriously unreliable because they depend on user adherence to the app. It is therefore crucial for machine learning models to account for self-tracking artifacts such as skipped tracking. In this abstract, we propose and evaluate a hierarchical, generative model that predicts the next cycle length from previously tracked cycle lengths and accounts explicitly for the possibility that users forget to track their period. Our model offers several advantages: 1) accounting explicitly for self-tracking artifacts yields better prediction accuracy as the likelihood of skipping increases; 2) because the model is generative, its predictions can be updated online as a given cycle evolves; and 3) its hierarchical structure enables modeling of an individual's cycle length history while incorporating population-level information. Our experiments on real mHealth cycle length data from 5,000 menstruators show that our method yields state-of-the-art performance against neural network-based and summary statistic-based baselines.
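To make the idea of accounting for skipped tracking concrete, the following is a minimal sketch and not the model evaluated in this work. It assumes, purely for illustration, that a true cycle length is Poisson-distributed around a per-user mean, that each cycle is missed independently with a fixed probability, and that a tracked length spanning s missed cycles is Poisson with mean (s + 1) times the per-user mean. The function names and parameter values are hypothetical.

```python
import math


def poisson_log_pmf(k: int, mean: float) -> float:
    """Log-probability of observing k under a Poisson with the given mean."""
    return k * math.log(mean) - mean - math.lgamma(k + 1)


def skip_posterior(observed_length: int, mean_cycle: float,
                   skip_prob: float, max_skips: int = 5) -> list:
    """Posterior over the number of skipped cycles s = 0..max_skips.

    Illustrative assumptions (not taken from the abstract):
      - prior on s is geometric: P(s) = skip_prob**s * (1 - skip_prob),
        truncated at max_skips and renormalized;
      - observed_length ~ Poisson(mean_cycle * (s + 1)).
    skip_prob must lie strictly between 0 and 1.
    """
    log_weights = []
    for s in range(max_skips + 1):
        log_prior = s * math.log(skip_prob) + math.log(1.0 - skip_prob)
        log_lik = poisson_log_pmf(observed_length, mean_cycle * (s + 1))
        log_weights.append(log_prior + log_lik)
    # Normalize in a numerically stable way (subtract the max log-weight).
    top = max(log_weights)
    weights = [math.exp(w - top) for w in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # A 60-day "cycle" for a user whose typical cycle is 29 days.
    posterior = skip_posterior(observed_length=60, mean_cycle=29.0, skip_prob=0.1)
    for s, p in enumerate(posterior):
        print(f"skipped cycles = {s}: posterior = {p:.4f}")
```

Under these assumptions, an observed length close to twice the user's typical cycle is attributed mostly to one missed entry rather than to a single unusually long cycle, which is the kind of artifact a model that ignores skipping would misread.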