Poster in Workshop: Generative AI and Biology (GenBio@NeurIPS2023)
regLM: Designing realistic regulatory DNA with autoregressive language models
Avantika Lal · Tommaso Biancalani · Gokcen Eraslan
Keywords: [ enhancer design ] [ Autoregressive language modeling ] [ GPT ] [ CRE design ] [ generative sequence modeling ] [ DNA sequence modeling ] [ hyenaDNA ]
Designing cis-regulatory DNA elements (CREs) with desired properties is a challenging task with many therapeutic applications. Here, we used autoregressive language models trained on yeast and human putative CREs, in conjunction with supervised sequence-to-function models, to design regulatory DNA with desired patterns of activity. We showed that our framework, regLM, compares favorably to existing design approaches. regLM facilitates the design of realistic and diverse regulatory DNA while providing insights into the cis-regulatory code.
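The pairing of a generative model with a supervised scorer described above can be illustrated with a toy generate-then-filter loop. This is a minimal sketch under loud assumptions, not the regLM implementation: a first-order Markov chain stands in for the autoregressive language model, GC fraction stands in for a trained sequence-to-function model, and every name here (`sample_sequence`, `toy_activity_score`, `design`) is hypothetical.

```python
import random

NUCLEOTIDES = "ACGT"

def sample_sequence(transitions, length, rng):
    """Sample a sequence base by base; each base is drawn conditioned on the
    previous one (a toy stand-in for an autoregressive language model)."""
    seq = [rng.choice(NUCLEOTIDES)]
    while len(seq) < length:
        probs = transitions[seq[-1]]
        weights = [probs[b] for b in NUCLEOTIDES]
        seq.append(rng.choices(NUCLEOTIDES, weights=weights)[0])
    return "".join(seq)

def toy_activity_score(seq):
    """Hypothetical stand-in for a supervised sequence-to-function model:
    returns the GC fraction of the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)

def design(transitions, n_candidates, length, min_score, seed=0):
    """Generate candidates, then keep those the scorer rates at or above
    the threshold."""
    rng = random.Random(seed)
    candidates = [sample_sequence(transitions, length, rng)
                  for _ in range(n_candidates)]
    return [s for s in candidates if toy_activity_score(s) >= min_score]

# Assumed transition table: uniform, so every next base is equally likely.
uniform = {prev: {nxt: 0.25 for nxt in NUCLEOTIDES} for prev in NUCLEOTIDES}
designed = design(uniform, n_candidates=200, length=50, min_score=0.6)
```

In a realistic setting, the Markov chain would be replaced by a trained language model and the GC-fraction scorer by a trained regression model; the filtering step itself would keep the same shape.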