IPCARF: improving lncRNA-disease association prediction using incremental principal component analysis feature selection and a random forest classifier

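Random forest, Principal component analysis, Feature selection, Computer science, Artificial intelligence, Pattern recognition (psychology), Machine learning, Cross-validation, Classifier (UML), Similarity (geometry), Support vector machine, Data mining, Dimensionality reduction, Image (mathematics)
Authors
Rong Zhu, Yong Wang, Jin‐Xing Liu, Lingyun Dai
Source
Journal: BMC Bioinformatics [BioMed Central]
Volume (issue): 22 (1) Cited by: 32
Identifier
DOI: 10.1186/s12859-021-04104-9
Abstract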

Background: Identifying lncRNA-disease associations not only helps clarify the mechanisms underlying human diseases at the lncRNA level but also speeds up the identification of potential biomarkers for disease diagnosis, treatment, prognosis, and drug response prediction. However, as the volume of archived biological data continues to grow, it has become increasingly difficult to detect potential human lncRNA-disease associations in these enormous datasets using traditional experimental methods. Consequently, developing new and effective computational methods to predict potential human lncRNA-disease associations is essential.

Results: We propose a new algorithm, IPCARF, for predicting lncRNA-disease associations. It is based on integrated machine learning: it combines incremental principal component analysis (IPCA) with a random forest (RF) classifier and integrates multiple similarity matrices. First, we use two different models to compute a semantic similarity matrix of diseases from a directed acyclic graph of diseases. Second, we build a feature vector for each lncRNA-disease pair by integrating disease similarity, lncRNA similarity, and Gaussian kernel similarity. Third, we apply IPCA to reduce the dimensionality of the original feature set and obtain the best feature subspace. Finally, we train an RF model to predict potential lncRNA-disease associations. The experimental results show that IPCARF improves the AUC when predicting potential lncRNA-disease associations. Under 10-fold cross-validation, IPCARF reached an AUC of 0.8529 before parameter optimization and 0.8611 after the optimal parameters were selected by grid search.

Conclusions: We compared IPCARF with the existing LRLSLDA, LRLSLDA-LNCSIM, TPGLDA, NPCMF, and ncPred prediction methods, all of which have shown excellent performance in predicting lncRNA-disease associations. Under 10-fold cross-validation, IPCARF outperformed each of the compared methods.
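
The workflow described in the abstract (similarity-based pair features, IPCA for dimensionality reduction, an RF classifier, 10-fold cross-validated AUC) can be sketched with scikit-learn as follows. This is only an illustrative sketch on synthetic data, not the authors' implementation. The matrix sizes, the bandwidth convention in `gaussian_kernel_similarity`, the feature layout in `build_pair_features`, the balanced negative sampling, and all hyperparameters are assumptions made for the example. Disease semantic similarity is omitted. Because the toy similarities are computed from the same association matrix that supplies the labels, the resulting scores say nothing about real performance.

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline


def gaussian_kernel_similarity(assoc):
    """Gaussian kernel similarity between the rows of a binary association matrix."""
    sq_norms = (assoc ** 2).sum(axis=1)
    # Assumed convention: bandwidth normalised by the mean squared row norm.
    gamma = 1.0 / max(sq_norms.mean(), 1e-12)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * assoc @ assoc.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def build_pair_features(lnc_sim, dis_sim, pairs):
    """Hypothetical feature layout: lncRNA similarity row followed by disease similarity row."""
    return np.hstack([lnc_sim[pairs[:, 0]], dis_sim[pairs[:, 1]]])


rng = np.random.default_rng(0)

# Synthetic lncRNA-by-disease association matrix (40 lncRNAs, 30 diseases).
assoc = (rng.random((40, 30)) < 0.1).astype(float)
lnc_sim = gaussian_kernel_similarity(assoc)    # 40 x 40
dis_sim = gaussian_kernel_similarity(assoc.T)  # 30 x 30

# Known associations are positives; an equal number of unknown pairs serve as negatives.
pos = np.argwhere(assoc == 1)
neg_all = np.argwhere(assoc == 0)
neg = neg_all[rng.choice(len(neg_all), size=len(pos), replace=False)]
pairs = np.vstack([pos, neg])
X = build_pair_features(lnc_sim, dis_sim, pairs)
y = np.concatenate([np.ones(len(pos), dtype=int), np.zeros(len(neg), dtype=int)])

# IPCA reduces the feature dimension before the random forest is trained.
model = make_pipeline(
    IncrementalPCA(n_components=20, batch_size=50),
    RandomForestClassifier(n_estimators=100, random_state=0),
)

# 10-fold cross-validated AUC, one score per fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
print(f"mean AUC over {len(scores)} folds: {scores.mean():.4f}")
```

Wrapping the two steps in one pipeline means IPCA is refit on each training fold, so the held-out fold does not influence the projection. The grid search step mentioned in the abstract could be approximated by passing the same pipeline to scikit-learn's `GridSearchCV` with a grid over, for example, `n_components` and `n_estimators`. The abstract does not state which parameters the authors tuned.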