Exploring Generative Approaches for Predicting Copolymer Sequences from Reaction Conditions
Abstract
Precise control over monomer sequences in synthetic copolymers is essential for tailoring material properties but remains challenging because of the complexity of polymerization processes. Simulation studies have provided valuable insights into how individual factors influence sequence formation, yet they often examine parameters in isolation and fail to capture their combined effects. Previous applications in polymer sequence design and reaction optimization have shown that machine learning can efficiently navigate complex parameter spaces and accelerate discovery, suggesting that it can also advance the understanding and control of sequence formation during copolymerization. In this work, we propose PolyGen, a unified conditional generative model that predicts block-length distributions, which serve as characteristic features of polymer sequences. Using simulation datasets, we demonstrate that PolyGen accurately predicts copolymer block-length distributions in most cases across diverse chemical and physical conditions, including monomer interactions, chain stiffness, activation energy, monomer density, and solvent viscosity. By linking synthesis parameters with sequence outcomes, PolyGen establishes a new machine learning–based approach for investigating and guiding the design of sequence-controlled polymers, thereby accelerating their study and potential applications.