DeepDigest: Prediction of Protein Proteolytic Digestion with Deep Learning

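Protease, Chemistry, Proteomics, Trypsin, Chymotrypsin, Proteolytic enzyme, Computational biology, Shotgun proteomics, Cleavage (geology), Biochemistry, Artificial intelligence, Machine learning, Computer science, Biology, Paleontology, Gene, Fracture (geology)
Authors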
Jinghan Yang,Zhiqiang Gao,Xiuhan Ren,Jie Sheng,Ping Xu,Cheng Chang,Yan Fu
Source
Journal: Analytical Chemistry [American Chemical Society]
Volume (issue): 93 (15): 6094-6103 Cited by: 40
Identifier
DOI:10.1021/acs.analchem.0c04704
Abstract

Proteolytic digestion of proteins by one or more proteases is a key step in shotgun proteomics, in which the proteolytic products, i.e., peptides, are taken as the surrogates of their parent proteins for further qualitative or quantitative analysis. The proteases generally cleave proteins at specific amino acid residue sites, but digestion is hardly complete (wide existence of missed cleavage sites). Therefore, it would be of great help to improve the prior experimental design and the posterior data analysis if the digestion behaviors of proteases can be accurately modeled and predicted. At present, systematic studies about the commonly used proteases in proteomics are insufficient, and there is a lack of easy-to-use tools to predict the cleavage sites of different proteases. Here, we propose a novel sequence-based deep learning algorithm, DeepDigest, which integrates convolutional neural networks and long short-term memory networks for protein digestion prediction. DeepDigest can predict the cleavage probability of each potential cleavage site on the protein sequences for eight popular proteases including trypsin, ArgC, chymotrypsin, GluC, LysC, AspN, LysN, and LysargiNase. We compared DeepDigest with three traditional machine learning algorithms, i.e., logistic regression, random forest, and support vector machine. On the eight training data sets, the 10-fold cross-validation accuracies (AUCs) of DeepDigest were 0.956-0.982, significantly higher than those of the three traditional algorithms. On the 11 independent test data sets, DeepDigest achieved AUCs between 0.849 and 0.978, outperforming the other traditional algorithms in most cases. Transfer learning then further improved the prediction accuracy. Besides, some interesting characteristics of different proteases were revealed and discussed.
Ultimately, as an application, we used DeepDigest to predict the digestibilities of peptides and demonstrated that peptide digestibility is an informative new feature to discriminate between correct and incorrect peptide identifications.
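
To make the terms "potential cleavage site" and "peptide digestibility" concrete, the following is a minimal Python sketch and not the DeepDigest implementation. It assumes simplified textbook specificity rules for the eight proteases (the table `PROTEASE_RULES` is illustrative and not taken from the paper), and it assumes per-site cleavage probabilities are already available, for example from a predictor such as DeepDigest. The scoring function `peptide_digestibility` is a hypothetical formulation that treats sites as independent: it multiplies the cleavage probabilities at the two peptide termini by the non-cleavage probabilities at internal (missed) sites. The abstract does not state the formula the authors use.

```python
# Simplified specificity rules (an assumption for illustration, not from the paper).
# "C": the protease cuts after the listed residue; "N": it cuts before it.
PROTEASE_RULES = {
    "trypsin": ("KR", "C"),
    "ArgC": ("R", "C"),
    "chymotrypsin": ("FYWL", "C"),
    "GluC": ("E", "C"),
    "LysC": ("K", "C"),
    "AspN": ("D", "N"),
    "LysN": ("K", "N"),
    "LysargiNase": ("KR", "N"),
}


def candidate_sites(sequence, protease):
    """Return bond indices i, where i means a cut between sequence[i-1] and sequence[i]."""
    residues, side = PROTEASE_RULES[protease]
    sites = []
    for i in range(1, len(sequence)):
        # The residue that decides specificity sits left or right of the bond.
        target = sequence[i - 1] if side == "C" else sequence[i]
        if target in residues:
            sites.append(i)
    return sites


def peptide_digestibility(start, end, site_probs, seq_len):
    """Hypothetical score for the peptide sequence[start:end].

    site_probs maps a bond index to its predicted cleavage probability.
    Assumes sites are cleaved independently of one another.
    """
    score = 1.0
    # Both peptide ends must be cleaved, unless an end is a protein terminus.
    for pos in (start, end):
        if 0 < pos < seq_len:
            score *= site_probs.get(pos, 0.0)
    # Every candidate site inside the peptide must stay uncleaved.
    for pos, p in site_probs.items():
        if start < pos < end:
            score *= 1.0 - p
    return score


if __name__ == "__main__":
    protein = "MKTAYRAKDE"
    print(candidate_sites(protein, "trypsin"))
    # Made-up probabilities for the three tryptic sites of this toy sequence.
    probs = {2: 0.9, 6: 0.5, 8: 0.8}
    print(peptide_digestibility(2, 8, probs, len(protein)))
```

Under this assumption, a peptide with one missed cleavage scores high only when its terminal sites are likely to be cut and its internal site is likely to be skipped, which is the kind of signal the abstract describes as useful for separating correct from incorrect peptide identifications.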