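Autoencoder
Random forest
Computer science
MicroRNA
Computational biology
Disease
Artificial intelligence
Source code
Machine learning
Data mining
Medicine
Biology
Artificial neural network
Pathology
Genetics
Gene
Operating system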
Authors
Qiuying Dai,Yanyi Chu,Zhiqi Li,Yusong Zhao,Xueying Mao,Yanjing Wang,Yi Xiong,Dong-Qing Wei
Identifier
DOI:10.1016/j.compbiomed.2021.104706
Abstract
MicroRNAs (miRNAs) are significant regulators in various biological processes. They may become promising biomarkers or therapeutic targets, providing a new perspective on the diagnosis and treatment of multiple diseases. Because experimental methods are costly and resource-intensive, computational prediction of disease-related miRNAs is in great need. In this study, we developed MDA-CF to identify underlying miRNA-disease associations based on a cascade forest model. In this method, multi-source information was integrated to represent miRNAs and diseases comprehensively, and an autoencoder was used for dimension reduction to obtain the optimal feature space. The cascade forest model was then employed for miRNA-disease association prediction. As a result, the average AUC of MDA-CF was 0.9464 on HMDD v3.2 in five-fold cross-validation. Compared with previous computational methods, MDA-CF performed better on HMDD v2.0, with an average AUC of 0.9258. Moreover, MDA-CF was applied to colon neoplasm, breast neoplasm, and gastric neoplasm, and 100%, 86%, and 88% of the top 50 potential miRNAs, respectively, were validated by authoritative databases. In conclusion, MDA-CF appears to be a reliable method to uncover disease-associated miRNAs. The source code of MDA-CF is available at https://github.com/a1622108/MDA-CF .
Highlights
• MDA-CF is developed for miRNA-disease association prediction using cascade forest.
• Multiple sources of information are combined to represent miRNAs and diseases.
• The autoencoder is utilized to obtain a representative feature space.
• MDA-CF combines the bagging method random forest and the boosting method xgboost.
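The abstract reports performance as an average AUC over five-fold cross-validation. The following is a minimal sketch of how such a figure could be computed, using only the Python standard library. It is not taken from the MDA-CF repository: the function names (`auc_score`, `stratified_folds`, `cross_validated_auc`), the stratified splitting strategy, and the `fit_predict` callback are all assumptions made for illustration, and the paper's actual sampling of negative miRNA-disease pairs may differ.

```python
import random


def auc_score(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) formulation.

    labels: list of 0/1 ground-truth values; scores: predicted scores.
    Tied scores receive the average of the ranks they span.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")
    pairs = sorted(zip(scores, labels))
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(lbl for _, lbl in pairs[i:j])
        i = j
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds, keeping the class ratio per fold."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)
    return [pos[f::k] + neg[f::k] for f in range(k)]


def cross_validated_auc(labels, fit_predict, k=5, seed=0):
    """Average AUC over k folds.

    fit_predict(train_idx, test_idx) is a hypothetical callback that trains
    any model on train_idx and returns one score per index in test_idx.
    """
    folds = stratified_folds(labels, k, seed)
    aucs = []
    for f, test_idx in enumerate(folds):
        train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
        scores = fit_predict(train_idx, test_idx)
        aucs.append(auc_score([labels[i] for i in test_idx], scores))
    return sum(aucs) / len(aucs)
```

In this sketch any classifier can be plugged in through `fit_predict`, for example a cascade of forest-style learners trained on autoencoder-reduced features, so the evaluation loop stays independent of the model being scored.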