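Epitope
Computer science
Artificial intelligence
T-cell receptor
Transformer
Unsupervised learning
Language model
Economic shortage
Machine learning
Computational biology
T cell
Biology
Immune system
Antigen
Immunology
Voltage
Linguistics
Philosophy
Physics
Quantum mechanics
Government (linguistics)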
Authors
Barthelemy Meynard-Piganeau, Christoph Feinauer, Martin Weigt, Aleksandra M. Walczak, Thierry Mora
Identifier
DOI:10.1073/pnas.2316401121
Abstract
The accurate prediction of binding between T cell receptors (TCR) and their cognate epitopes is key to understanding the adaptive immune response and developing immunotherapies. Current methods face two significant limitations: the shortage of comprehensive high-quality data and the bias introduced by the selection of the negative training data commonly used in the supervised learning approaches. We propose a method, Transformer-based Unsupervised Language model for Interacting Peptides and T cell receptors (TULIP), that addresses both limitations by leveraging incomplete data and unsupervised learning and using the transformer architecture of language models. Our model is flexible and integrates all possible data sources, regardless of their quality or completeness. We demonstrate the existence of a bias introduced by the sampling procedure used in previous supervised approaches, emphasizing the need for an unsupervised approach. TULIP recognizes the specific TCRs binding an epitope, performing well on unseen epitopes. Our model outperforms state-of-the-art models and offers a promising direction for the development of more accurate TCR epitope recognition models.
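The abstract's point about negative training data can be made concrete with a small sketch. One common way for supervised approaches to obtain negatives is to re-pair each TCR with an epitope it was not recorded as binding. The code below illustrates that general idea only; it is not the sampling procedure of any specific published method, and it is not part of TULIP. The function name `shuffled_negatives` and the placeholder labels (`TCR_A`, `EPITOPE_1`, and so on) are hypothetical.

```python
import random


def shuffled_negatives(positive_pairs, n_per_positive=1, seed=0):
    """Build assumed-negative (tcr, epitope) pairs by re-pairing each TCR
    with epitopes it is not recorded as binding (hypothetical helper)."""
    rng = random.Random(seed)
    positives = set(positive_pairs)
    epitopes = sorted({ep for _, ep in positive_pairs})
    negatives = []
    for tcr, _ in positive_pairs:
        # Epitopes with no recorded binding to this TCR.
        candidates = [ep for ep in epitopes if (tcr, ep) not in positives]
        k = min(n_per_positive, len(candidates))
        for ep in rng.sample(candidates, k):
            # Labeled negative by construction, never measured experimentally.
            negatives.append((tcr, ep))
    return negatives


# Placeholder labels, not real sequences.
positive_pairs = [
    ("TCR_A", "EPITOPE_1"),
    ("TCR_B", "EPITOPE_1"),
    ("TCR_C", "EPITOPE_2"),
]
negatives = shuffled_negatives(positive_pairs)
```

Every pair in `negatives` is a non-binder only by assumption, so a classifier trained on it inherits whatever the sampling choice implies. An unsupervised model trained on observed binding pairs alone needs no such set.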