EPIP: MHC-I epitope prediction integrating mass spectrometry derived motifs and tissue-specific expression profiles

Epitope  Human leukocyte antigen  Computational biology  Biology  Antigen  Genetics
Authors
Wenhua Hu, Shijun Qiu, Y. Li, Xin Lin, Lei Zhang, Hongjun Xiang, Xin-Yu Han, S. Zhu, Long Qing Chen, Shu Li, Wei Li, Z. L. Ren, G. Y. Hou, Zhihong Lin, J. G. Lu, Gui-Rong Liu, Bing Li, L. James Lee
Identifier
DOI:10.1101/567081
Abstract

Background: Accurate prediction of epitopes presented by human leukocyte antigen (HLA) is crucial for personalized cancer immunotherapies targeting T cell epitopes. Mass spectrometry (MS) profiling of eluted HLA ligands, which provides high-throughput measurements of HLA-associated peptides in vivo, can be used to faithfully model the presentation of epitopes on the cell surface. In addition, gene expression profiles measured by RNA-seq in a specific cell or tissue type can significantly improve the performance of epitope presentation prediction. However, although a large amount of high-quality MS data on HLA-bound peptides has been generated in recent years, few studies provide matching RNA-seq data, which makes it difficult to incorporate gene expression into epitope prediction.

Methods: We collected publicly available HLA peptidome and matching RNA-seq data for 34 cell lines derived from various sources. We built position-specific scoring matrices (PSSMs) for 21 HLA-I alleles based on these MS data, then used logistic regression (LR) to model the relationship among PSSM score, gene expression, and peptide length in order to predict whether a peptide could be presented in each of the cell lines. We further built a universal LR model, termed Epitope Presentation Integrated Prediction (EPIP), based on more than 180,000 unique HLA ligands collected from public sources and ~3,000 HLA ligands that we generated ourselves, to predict epitope presentation for 66 common HLA-I alleles.

Results: When evaluated on large, independent HLA eluted ligand datasets, EPIP performed substantially better than other popular methods, including MixMHCpred (v2.0), NetMHCpan (v4.0), and MHCflurry (v1.2.2), with an average 0.1% positive predictive value (PPV) of 52.01%, compared to 37.24%, 36.96%, 24.90%, and 23.76% achieved by MixMHCpred, NetMHCpan-4.0 (EL), NetMHCpan-4.0 (BA), and MHCflurry, respectively. EPIP is also comparable to EDGE, a recent deep learning-based model that is not publicly available, in predicting epitope presentation and selecting immunogenic cancer neoantigens. However, the simplicity and flexibility of EPIP make it easier to apply in diverse situations. We demonstrated this by generating MS data for the HCC4006 cell line and adding support for HLA-A*33:03 to EPIP. EPIP is publicly available as a web tool < http://epip.genomics.cn/ >.

Conclusions: We have developed an easy-to-use, publicly available epitope prediction tool, EPIP, that incorporates information from both MS and RNA-seq data, and demonstrated its superior performance over existing public methods.
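The modeling idea described in the abstract (a PSSM score built from eluted ligands, combined with gene expression by logistic regression) can be illustrated with a small sketch. This is not the EPIP implementation: the uniform amino acid background, the pseudocount, the log2(TPM + 1) expression transform, the gradient-descent fit, and all toy peptides, expression values, and labels are assumptions made for illustration. Peptide length, which the abstract lists as a third feature, is left out because the toy data contain only 9-mers. The helper `ppv_at_top_fraction` reflects one common reading of "0.1% PPV" (the fraction of true ligands among the top-scoring 0.1% of candidates), which is also an assumption here.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_pssm(ligands, pseudocount=1.0):
    """Log-odds PSSM from equal-length ligands (uniform background assumed)."""
    background = 1.0 / len(AMINO_ACIDS)
    total = len(ligands) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(len(ligands[0])):
        counts = Counter(pep[pos] for pep in ligands)
        pssm.append({aa: math.log2((counts[aa] + pseudocount) / total / background)
                     for aa in AMINO_ACIDS})
    return pssm

def pssm_score(pssm, peptide):
    """Sum of per-position log-odds for a peptide of the PSSM's length."""
    return sum(column[aa] for column, aa in zip(pssm, peptide))

def features(pssm, peptide, tpm):
    """Feature vector: PSSM score and log2(TPM + 1) (hypothetical transform)."""
    return [pssm_score(pssm, peptide), math.log2(tpm + 1.0)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, x):
    """Presentation probability; weights = [bias, w_score, w_expression]."""
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))

def fit_logistic(X, y, learning_rate=0.01, epochs=3000):
    """Logistic regression fitted by plain batch gradient descent."""
    weights = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for xi, yi in zip(X, y):
            err = predict(weights, xi) - yi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        weights = [w - learning_rate * g / len(X) for w, g in zip(weights, grad)]
    return weights

def ppv_at_top_fraction(scores, labels, fraction=0.001):
    """Fraction of positives among the top-scoring `fraction` of candidates."""
    n_top = max(1, int(len(scores) * fraction))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:n_top]) / n_top

# Toy 9-mers used as "eluted ligands"; illustration only.
toy_ligands = ["SLYNTVATL", "GLCTLVAML", "NLVPMVATV", "ALAKAAAAV"]
pssm = build_pssm(toy_ligands)

# (peptide, expression in TPM, presented?) with made-up values and labels.
training = [
    ("SLYNTVATL", 50.0, 1), ("GLCTLVAML", 120.0, 1),
    ("NLVPMVATV", 30.0, 1), ("ALAKAAAAV", 80.0, 1),
    ("DDDDDDDDD", 0.5, 0), ("KPEWGRHQC", 1.0, 0),
    ("WWPGDEKRH", 2.0, 0), ("QQQEEEKKK", 40.0, 0),
]
X = [features(pssm, pep, tpm) for pep, tpm, _ in training]
y = [label for _, _, label in training]
weights = fit_logistic(X, y)

print(predict(weights, features(pssm, "SLYNTVATL", 50.0)))
print(predict(weights, features(pssm, "DDDDDDDDD", 0.5)))
```

In practice a library routine such as scikit-learn's `LogisticRegression` would replace the hand-written fit; the pure-Python version is used here only to keep the sketch free of dependencies.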