Revealing third-order interactions through the integration of machine learning and entropy methods in genomic studies

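Genome-wide association study · Computer science · Epistasis · Single-nucleotide polymorphism · Genetic association · SNP · Computational biology · Artificial intelligence · Data mining · Machine learning · Biology · Genetics · Genotype · Gene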
Authors
Burcu Yaldız,Onur Erdoğan,Sevda Rafatov,Cem İyigün,Yeşim Aydın Son
Source
Journal: BioData Mining [BioMed Central]
Volume (issue): 17 (1) Citations: 1
Identifier
DOI:10.1186/s13040-024-00355-3
Abstract

Background: Non-linear relationships at the genotype level are essential to understanding the genetic interactions underlying complex disease traits. Genome-wide association studies (GWAS) have revealed statistical associations between SNPs and many complex diseases. Because GWAS results could not fully reveal the genetic background of these disorders, genome-wide interaction studies have started to gain importance. In recent years, various statistical approaches, such as entropy-based methods, have been suggested for revealing these non-additive interactions between variants. This study presents a novel prioritization workflow that integrates two-step Random Forest (RF) modeling and entropy analysis after PLINK filtering. The PLINK-RF-RF workflow is followed by an entropy-based three-way interaction information (3WII) method that captures the hidden patterns resulting from non-linear relationships between genotypes in Late-Onset Alzheimer Disease (LOAD), with the aim of discovering early and differential diagnosis markers.

Results: Three models are developed from different datasets by integrating PLINK-RF-RF analysis with the entropy-based 3WII calculation, which enables the detection of third-order interactions that epistatic interaction studies do not primarily consider. For all three datasets, a reduced SNP set is selected by 3WII analysis after PLINK filtering and RF-RF prioritization of SNPs, which is promising as a model minimization approach. Among the SNPs revealed by 3WII, 4 of 19 from GenADA, 1 of 27 from ADNI, and 4 of 106 from NCRAD map to genes directly associated with Alzheimer Disease. Additionally, several SNPs are associated with other neurological disorders. The genes to which the variants map in all datasets are significantly enriched in the calcium ion binding, extracellular matrix, external encapsulating structure, and "RUNX1 regulates estrogen receptor-mediated transcription" pathways. These functional pathways are therefore proposed for further examination for a possible LOAD association. In addition, all 3WII variants are proposed as candidate biomarkers for genotyping-based LOAD diagnosis.

Conclusion: The entropy approach performed in this study reveals complex genetic interactions that contribute significantly to LOAD risk. The entropy-based 3WII served as a model minimization step and identified the significant three-way interactions among the SNPs prioritized by PLINK-RF-RF. This framework is a promising approach for disease association studies and can be modified by integrating other machine learning and entropy-based interaction methods.
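
The abstract does not spell out how 3WII is computed, so the following is only a minimal sketch of a generic three-way interaction information calculation on discrete variables, not the authors' implementation. It assumes the sign convention II(A;B;C) = I(A;B|C) − I(A;B), which expands to H(AB) + H(AC) + H(BC) − H(A) − H(B) − H(C) − H(ABC); other papers use the opposite sign. The function names and the toy data (`snp_a`, `snp_b`, `trait`) are hypothetical and not taken from the study.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Shannon entropy (in bits) of the joint distribution of the given columns."""
    joint = list(zip(*columns))
    n = len(joint)
    counts = Counter(joint)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def three_way_interaction_information(a, b, c):
    """II(A;B;C) = I(A;B|C) - I(A;B), expanded into joint entropies.

    Assumed convention: positive values indicate synergy (the three
    variables jointly carry information that no pair carries), negative
    values indicate redundancy.
    """
    return (entropy(a, b) + entropy(a, c) + entropy(b, c)
            - entropy(a) - entropy(b) - entropy(c)
            - entropy(a, b, c))


# Toy data (hypothetical): the trait is the XOR of two binary variant codes,
# a purely non-additive relationship that no single variable reveals.
snp_a = [0, 0, 1, 1]
snp_b = [0, 1, 0, 1]
trait = [a ^ b for a, b in zip(snp_a, snp_b)]

ii = three_way_interaction_information(snp_a, snp_b, trait)
print(f"3-way interaction information: {ii:.3f} bits")
```

In this toy case each variable alone is uninformative about the others, yet any two determine the third, so the measure comes out at one bit of synergy. With real genotype data the columns would typically hold genotype codes per sample, and the plug-in entropy estimate used here can be biased for small sample sizes.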