Quantitative omnigenic model discovers interpretable genome-wide associations

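Genome-wide association study; Expression quantitative trait loci; Quantitative trait locus; Trait; Computational biology; Gene regulatory network; Biology; Variance (accounting); Genetic association; Genome; Covariance; Gene; Statistical power; Genetics; Statistical model; Evolutionary biology; Computer science; Gene expression; Statistics; Mathematics; Machine learning; Single-nucleotide polymorphism; Genotype; Business; Accounting; Programming language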

Authors

Natália Ružičková, Michal Hledík, Gašper Tkačik

Identifiers

DOI: 10.1101/2024.02.01.578486

Abstract

As their statistical power grows, genome-wide association studies (GWAS) have identified an increasing number of loci underlying quantitative traits of interest. These loci are scattered throughout the genome and are individually responsible for only small fractions of the total heritable trait variance. The recently proposed omnigenic model provides a conceptual framework to explain these observations by postulating that numerous distant loci contribute to each complex trait via effect propagation through intracellular regulatory networks. We formalize this conceptual framework by proposing the “quantitative omnigenic model” (QOM), a statistical model that combines prior knowledge of the regulatory network topology with genomic data. By applying our model to gene expression traits in yeast, we demonstrate that QOM achieves gene expression prediction performance similar to that of traditional GWAS with hundreds of times fewer parameters, while simultaneously extracting candidate causal and quantitative chains of effect propagation through the regulatory network for every individual gene. We estimate the fraction of heritable trait variance in cis and in trans, break the latter down by effect propagation order, assess the trans variance not attributable to transcriptional regulation, and show that QOM correctly accounts for the low-dimensional structure of gene expression covariance. We furthermore demonstrate the relevance of QOM for systems biology by employing it as a statistical test for the quality of regulatory network reconstructions and by linking it to the propagation of non-transcriptional (including environmental) effects.

Significance statement

Genetic variation leads to differences in traits implicated in health and disease. The identification of genetic variants associated with these traits is spearheaded by “genome-wide association studies” (GWAS), statistically rigorous procedures whose power has grown with the number of genotyped samples. Nevertheless, GWAS have a substantial shortcoming: they are ill-equipped to detect the causal basis and reveal the complex systemic mechanisms of polygenic traits. Even a single genetic change can propagate throughout the entire genetic regulatory network, causing a myriad of spurious detections and thereby significantly limiting the usefulness of GWAS. To address this, we propose a novel statistical approach that incorporates known regulatory network information, with the potential to boost the interpretability of state-of-the-art genomic analyses while simultaneously extracting systems biology insights.
