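Context (archaeology)
Transcriptome
Computational biology
Annotation
Biology
Computer science
Artificial intelligence
Genetics
Gene expression
Gene
Paleontology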
Authors
Ruoqiao Chen,Jiayu Zhou,Bin Chen
Source
Journal: Cell Systems
[Elsevier]
Date: 2024-09-01
Volume (issue): 15 (9): 869-884.e6
Identifier
DOI:10.1016/j.cels.2024.08.006
Abstract
Cell surface proteins serve as primary drug targets and cell identity markers. Techniques such as CITE-seq (cellular indexing of transcriptomes and epitopes by sequencing) have enabled the simultaneous quantification of surface protein abundance and transcript expression within individual cells. The published data have been utilized to train machine learning models for predicting surface protein abundance solely from transcript expression. However, the small scale of proteins predicted and the poor generalization ability of these computational approaches across diverse contexts (e.g., different tissues/disease states) impede their widespread adoption. Here, we propose SPIDER (surface protein prediction using deep ensembles from single-cell RNA sequencing), a context-agnostic zero-shot deep ensemble model, which enables large-scale protein abundance prediction and generalizes better to various contexts. Comprehensive benchmarking shows that SPIDER outperforms other state-of-the-art methods. Using the predicted surface abundance of >2,500 proteins from single-cell transcriptomes, we demonstrate the broad applications of SPIDER, including cell type annotation, biomarker/target identification, and cell-cell interaction analysis in hepatocellular carcinoma and colorectal cancer. A record of this paper's transparent peer review process is included in the supplemental information.