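Enhancer
Biology
Sequence motif
Drosophila melanogaster
Computational biology
Genetics
Regulatory sequence
DNA
Gene
Sequence (biology)
Transcription factor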
Authors
Bernardo P. de Almeida, Franziska Reiter, Michaela Pagani, Alexander Stark
Source
Journal: Nature Genetics
[Springer Nature]
Date: 2022-05-01
Volume (issue): 54 (5): 613-624
Citations: 134
Identifier
DOI:10.1038/s41588-022-01048-5
Abstract
Enhancer sequences control gene expression and comprise binding sites (motifs) for different transcription factors (TFs). Despite extensive genetic and computational studies, the relationship between DNA sequence and regulatory activity is poorly understood, and de novo enhancer design has been challenging. Here, we built a deep-learning model, DeepSTARR, to quantitatively predict the activities of thousands of developmental and housekeeping enhancers directly from DNA sequence in Drosophila melanogaster S2 cells. The model learned relevant TF motifs and higher-order syntax rules, including functionally nonequivalent instances of the same TF motif that are determined by motif-flanking sequence and intermotif distances. We validated these rules experimentally and demonstrated that they can be generalized to humans by testing more than 40,000 wildtype and mutant Drosophila and human enhancers. Finally, we designed and functionally validated synthetic enhancers with desired activities de novo.