Improving compound–protein interaction prediction by building up highly credible negative samples

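Bioinformatics, Computer science, Machine learning, Computational biology, Bayesian probability, Artificial intelligence, Set (abstract data type), Data mining, Biology, Genetics, Gene, Programming language
Authors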
Hui Liu,Jianjiang Sun,Jihong Guan,Jie Zheng,Shuigeng Zhou
Source
Journal: Bioinformatics [Oxford University Press]
Volume (issue): 31 (12): i221-i229; Citations: 241
Identifier
DOI:10.1093/bioinformatics/btv256
Abstract

Motivation: Computational prediction of compound–protein interactions (CPIs) is of great importance for drug design and development, as genome-scale experimental validation of CPIs is not only time-consuming but also prohibitively expensive. With the availability of an increasing number of validated interactions, the performance of computational prediction approaches is severely impeded by the lack of reliable negative CPI samples. A systematic method of screening reliable negative samples becomes critical to improving the performance of in silico prediction methods.

Results: This article aims at building up a set of highly credible negative samples of CPIs via an in silico screening method. As most existing computational models assume that similar compounds are likely to interact with similar target proteins and achieve remarkable performance, it is rational to identify potential negative samples based on the converse negative proposition that the proteins dissimilar to every known/predicted target of a compound are unlikely to be targeted by the compound, and vice versa. We integrated various resources, including chemical structures, chemical expression profiles and side effects of compounds, amino acid sequences, protein–protein interaction network and functional annotations of proteins, into a systematic screening framework. We first tested the screened negative samples on six classical classifiers, and all these classifiers achieved remarkably higher performance on our negative samples than on randomly generated negative samples for both human and Caenorhabditis elegans. We then verified the negative samples on three existing prediction models, namely the bipartite local model, Gaussian kernel profile and Bayesian matrix factorization, and found that the performance of these models is also significantly improved on the screened negative samples. Moreover, we validated the screened negative samples on a drug bioactivity dataset. Finally, we derived two sets of new interactions by training a support vector machine classifier on the positive interactions annotated in DrugBank and our screened negative interactions. The screened negative samples and the predicted interactions provide the research community with a useful resource for identifying new drug targets and a helpful supplement to the current curated compound–protein databases.

Availability: Supplementary files are available at: http://admis.fudan.edu.cn/negative-cpi/.

Contact: sgzhou@fudan.edu.cn

Supplementary Information: Supplementary data are available at Bioinformatics online.
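
The screening idea stated in the abstract (a protein dissimilar to every known target of a compound is an unlikely target of that compound) can be illustrated with a minimal Python sketch. This is not the authors' framework: the abstract says several resources are integrated, whereas this sketch assumes a single precomputed protein similarity matrix, covers only the protein side of the proposition, and uses a made-up cutoff. The function name `screen_negatives`, the `threshold` value and the toy data are all hypothetical.

```python
def screen_negatives(protein_sim, known_targets, threshold=0.3):
    """Return (compound, protein) pairs in which the protein is dissimilar
    to every known target of the compound.

    protein_sim   -- square list of lists; protein_sim[i][j] is the assumed
                     similarity between proteins i and j, in [0, 1]
    known_targets -- dict mapping a compound id to a set of protein indices
    threshold     -- hypothetical cutoff; a pair is kept only if the highest
                     similarity to any known target is below this value
    """
    negatives = []
    n_proteins = len(protein_sim)
    for compound, targets in known_targets.items():
        if not targets:
            continue  # no known targets, so there is nothing to compare against
        for p in range(n_proteins):
            if p in targets:
                continue  # known positives are never negative candidates
            # Similarity of protein p to its closest known target
            max_sim = max(protein_sim[p][t] for t in targets)
            if max_sim < threshold:
                negatives.append((compound, p))
    return negatives


# Toy, made-up data: proteins 0 and 1 resemble each other, as do 2 and 3.
protein_sim = [
    [1.0, 0.8, 0.1, 0.2],
    [0.8, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.7],
    [0.2, 0.1, 0.7, 1.0],
]
known_targets = {"c1": {0}, "c2": {2, 3}}

candidates = screen_negatives(protein_sim, known_targets)
print(candidates)
```

With this toy data, protein 1 is excluded for compound `c1` because it closely resembles the known target 0, while proteins 2 and 3 are kept as negative candidates. A stricter (lower) threshold trades the number of candidates for higher credibility.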