Predicting Influenza Reassortment Potential Using Foundation Models and Genetic Algorithms for Pandemic Preparedness
Abstract
Influenza A virus (IAV) poses a persistent global threat because it evolves rapidly through reassortment, which hampers vaccine design and antiviral development and drives recurring outbreaks. We present a novel computational framework that combines DNABERT-2, a foundation model for genomic sequences, with genetic algorithms to predict reassortment events. Using environmental surveillance data, our method identifies both known and hypothetical reassortants and scores them by biological plausibility. Dimensionality reduction reveals clear separation between reassortant and non-reassortant embeddings. The framework enables early detection of high-risk strains, offering a scalable tool for pandemic preparedness. Our approach supports proactive strategies for vaccine, therapeutic, and economic resilience.
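To make the genetic-algorithm component concrete, the following is a minimal illustrative sketch, not the pipeline used in this work. It assumes a genotype is encoded as the parent strain supplying each of the eight IAV segments, and it replaces the plausibility score with a hypothetical pairwise compatibility matrix (`compat`); in a full system that score might instead be derived from DNABERT-2 embeddings. The names `toy_plausibility` and `evolve`, and all parameter values, are illustrative assumptions.

```python
import random

# The eight genome segments of influenza A virus.
SEGMENTS = ["PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS"]


def is_reassortant(genotype):
    """A genotype is a reassortant if its segments come from more than one parent."""
    return len(set(genotype)) > 1


def toy_plausibility(genotype, compat):
    """Hypothetical score: mean pairwise compatibility between segment sources.

    Pure parental genotypes score 0.0 so the search favors reassortants.
    `compat[a][b]` is an assumed compatibility in [0, 1] between parents a and b.
    """
    if not is_reassortant(genotype):
        return 0.0
    total, pairs = 0.0, 0
    for i in range(len(genotype)):
        for j in range(i + 1, len(genotype)):
            total += compat[genotype[i]][genotype[j]]
            pairs += 1
    return total / pairs


def evolve(n_parents, compat, pop_size=30, generations=40, mut_rate=0.1, seed=0):
    """Search segment-to-parent assignments with elitism, crossover, and mutation."""
    rng = random.Random(seed)
    pop = [tuple(rng.randrange(n_parents) for _ in SEGMENTS) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda g: toy_plausibility(g, compat), reverse=True)
        elite = pop[: pop_size // 2]  # keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover: each segment is inherited from one of two genotypes.
            child = tuple(rng.choice(pair) for pair in zip(a, b))
            # Mutation: occasionally reassign a segment to a random parent strain.
            child = tuple(
                rng.randrange(n_parents) if rng.random() < mut_rate else s
                for s in child
            )
            children.append(child)
        pop = elite + children
    # Report unique reassortant genotypes, highest score first.
    unique = {g for g in pop if is_reassortant(g)}
    ranked = [(g, toy_plausibility(g, compat)) for g in unique]
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


if __name__ == "__main__":
    # Assumed compatibility values between three hypothetical parent strains.
    compat = [[1.0, 0.6, 0.2], [0.6, 1.0, 0.4], [0.2, 0.4, 1.0]]
    for genotype, score in evolve(3, compat)[:5]:
        print(dict(zip(SEGMENTS, genotype)), round(score, 3))
```

Uniform crossover is a natural choice in this sketch because it mirrors segment exchange between co-infecting strains: each segment is inherited whole from one source, never recombined within a segment.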