EPIP: MHC-I epitope prediction integrating mass spectrometry derived motifs and tissue-specific expression profiles

Keywords: Epitope; Human leukocyte antigen; Computational biology; Biology; Antigen; Genetics
Authors
Wenhua Hu, Shijun Qiu, Y. Li, Xin Lin, Lei Zhang, Hongjun Xiang, Xin-Yu Han, S. Zhu, Long Qing Chen, Shu Li, Wei Li, Z. L. Ren, G. Y. Hou, Zhihong Lin, J. G. Lu, Gui-Rong Liu, Bing Li, L. James Lee
Identifier
DOI:10.1101/567081
Abstract

Background: Accurate prediction of epitopes presented by human leukocyte antigen (HLA) is crucial for personalized cancer immunotherapies targeting T cell epitopes. Mass spectrometry (MS) profiling of eluted HLA ligands, which provides high-throughput measurements of HLA-associated peptides in vivo, can be used to faithfully model the presentation of epitopes on the cell surface. In addition, gene expression profiles measured by RNA-seq in a specific cell or tissue type can significantly improve the performance of epitope presentation prediction. However, although a large amount of high-quality MS data on HLA-bound peptides has been generated in recent years, few studies provide matching RNA-seq data, which makes incorporating gene expression into epitope prediction difficult.

Methods: We collected publicly available HLA peptidome data and matching RNA-seq data for 34 cell lines derived from various sources. We built position-specific scoring matrices (PSSMs) for 21 HLA-I alleles from these MS data, then used logistic regression (LR) to model the relationship among PSSM score, gene expression, and peptide length in order to predict whether a peptide would be presented in each of the cell lines. We further built a universal LR model, termed Epitope Presentation Integrated Prediction (EPIP), based on more than 180,000 unique HLA ligands collected from public sources and ~3,000 HLA ligands that we generated ourselves, to predict epitope presentation for 66 common HLA-I alleles.

Results: When evaluated on large, independent HLA eluted ligand datasets, EPIP performed substantially better than other popular methods, including MixMHCpred (v2.0), NetMHCpan (v4.0), and MHCflurry (v1.2.2), with an average 0.1% positive predictive value (PPV) of 52.01%, compared to 37.24%, 36.96%, 24.90%, and 23.76% achieved by MixMHCpred, NetMHCpan-4.0 (EL), NetMHCpan-4.0 (BA), and MHCflurry, respectively. EPIP is also comparable to EDGE, a recent deep learning-based model that is not publicly available, in predicting epitope presentation and selecting immunogenic cancer neoantigens. However, the simplicity and flexibility of EPIP make it easier to apply in diverse situations, which we demonstrated by generating MS data for the HCC4006 cell line and adding support for HLA-A*33:03 to EPIP. EPIP is publicly available as a web tool at <http://epip.genomics.cn/>.

Conclusions: We have developed an easy-to-use, publicly available epitope prediction tool, EPIP, that incorporates information from both MS and RNA-seq data, and we have demonstrated its superior performance over existing public methods.
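The Methods summary describes a pipeline of PSSM scoring followed by logistic regression over PSSM score, gene expression, and peptide length. The Python sketch below illustrates that general idea only; it is not the authors' implementation. Everything the abstract does not state is an assumption made for illustration: the uniform amino acid background, the pseudocount, the `log2(TPM + 1)` expression transform, the hypothetical weight values, and the reading of "0.1% PPV" as precision among the top-scoring 0.1% of candidate peptides.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(ligands, pseudocount=1.0):
    """Build a log-odds PSSM from equal-length ligands.

    Assumption: uniform background frequencies and an additive pseudocount.
    """
    length = len(ligands[0])
    background = 1.0 / len(AMINO_ACIDS)
    total = len(ligands) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        counts = Counter(pep[pos] for pep in ligands)
        column = {
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        pssm.append(column)
    return pssm


def pssm_score(pssm, peptide):
    """Sum the per-position log-odds of a peptide of matching length."""
    return sum(column[aa] for column, aa in zip(pssm, peptide))


def presentation_probability(score, tpm, length, weights):
    """Logistic model over PSSM score, expression, and peptide length.

    Assumption: expression enters as log2(TPM + 1) and length as a
    per-length offset; the weights are hypothetical, not fitted values.
    """
    z = (
        weights["intercept"]
        + weights["pssm"] * score
        + weights["expr"] * math.log2(tpm + 1.0)
        + weights["length"].get(length, 0.0)
    )
    return 1.0 / (1.0 + math.exp(-z))


def top_fraction_ppv(scores, labels, fraction=0.001):
    """Fraction of true ligands among the top-scoring `fraction` of peptides.

    Assumption: this is one plausible reading of the "0.1% PPV" metric.
    """
    n_top = max(1, int(len(scores) * fraction))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:n_top]) / n_top


# Toy 9-mer ligands (made up for illustration, not real eluted peptides).
toy_ligands = ["ALDKFGHLV", "ALNKWGHLV", "SLDKFGHLV"]
toy_pssm = build_pssm(toy_ligands)
toy_weights = {"intercept": -4.0, "pssm": 0.3, "expr": 0.5, "length": {9: 0.5}}

motif_score = pssm_score(toy_pssm, "ALDKFGHLV")
prob = presentation_probability(motif_score, tpm=50.0, length=9, weights=toy_weights)
```

In practice the weights would be fitted on MS-observed ligands against decoy peptides, for example with a standard logistic regression routine, rather than set by hand as in this toy example.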