Comprehensive prediction and analysis of human protein essentiality based on a pre-trained protein large language model

Computer Science  Natural Language Processing  Artificial Intelligence
Authors
B. S. Kang, Rui Fan, Chunmei Cui, Qinghua Cui
Identifier
DOI:10.1101/2024.03.26.586900
Abstract

Human essential genes and their protein products are indispensable for the viability and development of individuals. Deciphering essential proteins is therefore important, and numerous computational methods have been developed for this purpose. However, current methods fail to comprehensively measure human protein essentiality at the levels of humans, human cell lines, and mouse orthologues. To address this, we developed Protein Importance Calculator (PIC), a sequence-based deep learning model built by fine-tuning a pre-trained protein language model. PIC outperformed existing methods, improving AUROC by 5.13% to 12.10% for predicting essential proteins at the human cell-line level. In addition, it improved AUROC by an average of 9.64% across 323 human cell lines compared with the only existing cell line-specific method, DeepCellEss. Moreover, we defined the Protein Essential Score (PES) to quantify protein essentiality based on PIC, and confirmed its power to measure human protein essentiality and functional divergence across the above three levels. Finally, we successfully used PES to identify prognostic biomarkers of breast cancer and, for the first time, to quantify the essentiality of 617,462 human microproteins.

Key Points
PIC outperformed existing computational methods for predicting essential proteins.
PIC could comprehensively predict human protein essentiality at the levels of humans, human cell lines, and mouse orthologues at the same time.
PES could serve as a potential metric to quantify the essentiality of both human proteins and human microproteins.