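Computer science
Disease
Regularization (linguistics)
Similarity (geometry)
Gaussian distribution
Bipartite graph
Matrix norm
Nuclear matrix
Artificial intelligence
Computational biology
Data mining
Machine learning
Mathematics
Pattern recognition (psychology)
Medicine
Biology
Theoretical computer science
Eigenvector
Chemistry
Pathology
Genetics
Graph
Computational chemistry
Physics
Quantum mechanics
Image (mathematics)
DNA
Chromatin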
Authors
Haiyan Liu,Pingping Bing,Meijun Zhang,Geng Tian,Jun Ma,Haigang Li,Meihua Bao,Kunhui He,Jianjun He,Binsheng He,Jialiang Yang
Identifier
DOI:10.1016/j.csbj.2022.12.053
Abstract
Identifying potential associations between microbes and diseases is the first step toward revealing the pathological mechanisms of microbe-associated diseases. However, traditional culture-based microbial experiments are expensive and time-consuming. It is therefore critical to prioritize disease-associated microbes by computational methods for further experimental validation. In this study, we proposed a novel method called MNNMDA to predict microbe-disease associations (MDAs) by applying a matrix nuclear norm method to known microbe and disease data. Specifically, we first calculated Gaussian interaction profile kernel similarity and functional similarity for diseases and microbes. Then we constructed a heterogeneous information network by combining the integrated disease similarity network, the integrated microbe similarity network, and the known microbe-disease bipartite network. Finally, we formulated the microbe-disease association prediction problem as a low-rank matrix completion problem, which was solved by minimizing the nuclear norm of a matrix with a few regularization terms. We tested the performance of MNNMDA on three datasets, HMDAD, Disbiome, and Combined Data, which are small, medium, and large in size, respectively. We also compared MNNMDA with five state-of-the-art methods: KATZHMDA, LRLSHMDA, NTSHMDA, GATMDA, and KGNMDA. MNNMDA achieved areas under the ROC curve (AUROC) of 0.9536 and 0.9364 on HMDAD and Disbiome, respectively, better than the AUROCs of the compared methods under 5-fold cross-validation for all microbe-disease associations. It also performed relatively well on the Combined Data, with an AUROC of 0.8858. In addition, MNNMDA outperformed the other methods in area under the precision-recall curve (AUPR) under 5-fold cross-validation for all associations, and in both AUROC and AUPR under 5-fold cross-validation for diseases and 5-fold cross-validation for microbes. Finally, case studies on colon cancer and inflammatory bowel disease (IBD) also validated the effectiveness of MNNMDA. In conclusion, MNNMDA is an effective method for predicting microbe-disease associations. The code and data for this paper are freely available on GitHub at https://github.com/Haiyan-Liu666/MNNMDA.
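The pipeline summarized in the abstract (similarity kernels, a heterogeneous matrix, then nuclear-norm-based completion) can be illustrated with a minimal NumPy sketch. This is not the authors' MNNMDA implementation, which is available at the repository above. Several choices here are assumptions made for illustration only: the function names (`gip_kernel`, `heterogeneous_matrix`, `soft_impute`, `predict_scores`) are hypothetical, the association matrix is taken to be diseases by microbes, the kernel bandwidth uses a common normalization, functional similarity is omitted, zeros in the association block are treated as unobserved, and the solver is plain iterated singular value soft-thresholding rather than the regularized formulation the abstract mentions.

```python
import numpy as np


def gip_kernel(profiles: np.ndarray) -> np.ndarray:
    """Gaussian interaction profile kernel between the rows of `profiles`."""
    sq_norms = (profiles ** 2).sum(axis=1)
    # Bandwidth normalised by the mean squared profile norm
    # (a common convention, assumed here rather than taken from the paper).
    gamma = 1.0 / max(sq_norms.mean(), 1e-12)
    # Pairwise squared Euclidean distances between interaction profiles.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def heterogeneous_matrix(assoc: np.ndarray) -> np.ndarray:
    """Stack [[S_d, A], [A^T, S_m]] for a disease-by-microbe matrix A."""
    s_d = gip_kernel(assoc)    # disease-disease similarity
    s_m = gip_kernel(assoc.T)  # microbe-microbe similarity
    return np.block([[s_d, assoc], [assoc.T, s_m]])


def soft_impute(target, observed, tau=1.0, n_iter=100):
    """Low-rank completion by iterated singular value soft-thresholding."""
    x = np.zeros_like(target, dtype=float)
    for _ in range(n_iter):
        # Keep observed entries and fill the rest with the current estimate.
        filled = np.where(observed, target, x)
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        # Shrinking singular values is the proximal step for the nuclear norm.
        x = (u * np.maximum(s - tau, 0.0)) @ vt
    return x


def predict_scores(assoc, tau=1.0, n_iter=100):
    """Return completed scores for the disease-by-microbe block."""
    n_d = assoc.shape[0]
    h = heterogeneous_matrix(assoc)
    # Assumption: similarities and known associations are observed,
    # while zeros in the association block are unknown entries to fill in.
    observed = np.ones_like(h, dtype=bool)
    unknown = assoc == 0
    observed[:n_d, n_d:] = ~unknown
    observed[n_d:, :n_d] = ~unknown.T
    completed = soft_impute(h, observed, tau=tau, n_iter=n_iter)
    return completed[:n_d, n_d:]


if __name__ == "__main__":
    # Toy data: 4 diseases x 5 microbes, 1 marks a known association.
    toy = np.array(
        [[1, 0, 1, 0, 0],
         [0, 1, 0, 0, 1],
         [1, 0, 0, 1, 0],
         [0, 0, 1, 0, 1]],
        dtype=float,
    )
    print(np.round(predict_scores(toy), 3))
```

In a sketch like this, higher scores at positions that were zero in the input would be read as candidate associations to prioritize. The threshold `tau` and the iteration count are arbitrary illustrative defaults and would need tuning on real data.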