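Mendelian randomization
Biomarker
Receiver operating characteristic
Random forest
Biomarker discovery
Computational biology
Microarray analysis techniques
Machine learning
Support vector machine
Feature selection
Gene
Medicine
Gene expression
Bioinformatics
Artificial intelligence
Biology
Genetics
Computer science
Genotype
Proteomics
Genetic variation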
Authors
Yidong Zhu,Jun Liu,Bo Wang
Abstract
Aim: To identify potential biomarkers and explore the mechanisms underlying diabetic nephropathy (DN) by integrating machine learning, Mendelian randomization (MR) and experimental validation.
Methods: Microarray and RNA-sequencing datasets (GSE47184, GSE96804, GSE104948, GSE104954, GSE142025 and GSE175759) were obtained from the Gene Expression Omnibus database. Differential expression analysis identified the differentially expressed genes (DEGs) between patients with DN and controls. Three machine learning algorithms (least absolute shrinkage and selection operator, support vector machine-recursive feature elimination, and random forest) were used to enhance gene selection accuracy and predictive power. Summary-level data from genome-wide association studies on DN were integrated with expression quantitative trait loci data to identify genes with potential causal relationships to DN. The predictive performance of the biomarker gene was validated using receiver operating characteristic (ROC) curves. Gene set enrichment and correlation analyses were conducted to investigate potential mechanisms. Finally, the biomarker gene was validated using quantitative real-time polymerase chain reaction in clinical samples from patients with DN and controls.
Results: From the 314 identified DEGs, the three integrated machine learning algorithms selected seven characteristic genes with high predictive performance. MR analysis revealed 219 genes with significant causal effects on DN, ultimately identifying one co-expressed gene, carbonic anhydrase II (CA2), as a key biomarker for DN. The ROC curves demonstrated the excellent predictive performance of CA2, with area under the curve values consistently above 0.878 across all datasets. Additionally, the analysis indicated a significant association between CA2 and infiltrating immune cells in DN, providing potential mechanistic insights. The biomarker was validated using clinical samples, confirming the reliability of the findings in clinical practice.
Conclusion: By integrating machine learning, MR and experimental validation, we identified and validated CA2 as a promising biomarker for DN with excellent predictive performance. The biomarker may play a role in the pathogenesis and progression of DN via immune-related pathways. These findings provide important insights into the molecular mechanisms underlying DN and may inform the development of personalized treatment strategies for this disease.
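The abstract reports area under the ROC curve (AUC) values for a single biomarker gene. As a minimal sketch of what that number measures, the AUC of a one-gene classifier equals the probability that a randomly chosen DN sample has a higher expression value than a randomly chosen control, with ties counted as one half. The function name `roc_auc` and the expression values below are hypothetical and synthetic; they are not taken from the study, and the authors' actual tooling is not stated in the abstract.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of cases and controls.

    scores: one expression value per sample.
    labels: 1 for a case (e.g. DN), 0 for a control.
    Returns the fraction of case/control pairs in which the case
    scores higher, counting ties as 0.5.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Synthetic, illustrative expression values (NOT data from the study).
expression = [2.1, 2.4, 3.0, 3.3, 1.0, 1.2, 1.9, 2.2]
is_case = [1, 1, 1, 1, 0, 0, 0, 0]

auc = roc_auc(expression, is_case)
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation. For a gene whose expression is lower in cases than in controls, this orientation yields a value below 0.5, so the scores would need to be negated before the comparison.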