上位性
进化生物学
生物
计算生物学
统计物理学
计量经济学
遗传学
数学
物理
基因
作者
Leonardo Di Bari,Matteo Bisardi,Sabrina Cotogno,Martin Weigt,Francesco Zamponi
出处
期刊:Cornell University - arXiv
日期:2024-03-14
标识
DOI:10.48550/arxiv.2403.09436
摘要
We introduce a data-driven epistatic model of protein evolution, capable of generating evolutionary trajectories spanning very different time scales reaching from individual mutations to diverged homologs. Our in silico evolution encompasses random nucleotide mutations, insertions and deletions, and models selection using a fitness landscape, which is inferred via a generative probabilistic model for protein families. We show that the proposed framework accurately reproduces the sequence statistics of both short-time (experimental) and long-time (natural) protein evolution, suggesting applicability also to relatively data-poor intermediate evolutionary time scales, which are currently inaccessible to evolution experiments. Our model uncovers a highly collective nature of epistasis, gradually changing the fitness effect of mutations in a diverging sequence context, rather than acting via strong interactions between individual mutations. This collective nature triggers the emergence of a long evolutionary time scale, separating fast mutational processes inside a given sequence context, from the slow evolution of the context itself. The model quantitatively reproduces the extent of contingency and entrenchment, as well as the loss of predictability in protein evolution observed in deep mutational scanning experiments of distant homologs. It thereby deepens our understanding of the interplay between mutation and selection in shaping protein diversity and novel functions, allows to statistically forecast evolution, and challenges the prevailing independent-site models of protein evolution, which are unable to capture the fundamental importance of epistasis.
科研通智能强力驱动
Strongly Powered by AbleSci AI