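Computer science
Proteomics
Embedding
Computational biology
Cluster analysis
Graph
Proteogenomics
Proteome
Artificial intelligence
Genomics
Bioinformatics
Biology
Genome
Theoretical computer science
Gene
Biochemistry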
Authors
Wei Li,Fan Yang,Fang Wang,Yu Rong,Bingzhe Wu,Han Zhang,Jianhua Yao
Identifier
DOI:10.1101/2022.12.14.520366
Abstract
Advances in single-cell proteomics sequencing technology shed light on protein-protein interactions, post-translational modifications, and the proteoform dynamics of proteins in a cell. However, uncertainty in peptide quantification, data missingness, severe batch effects, and high noise hinder the analysis of single-cell proteomic data. Solving this set of tangled problems together is a significant challenge that existing methods tailored for single-cell transcriptomics do not address. Here, we propose scPROTEIN, a novel and versatile framework designed for single-cell proteomic data analysis. It is composed of peptide uncertainty estimation based on a multi-task heteroscedastic regression model and cell embedding learning based on graph contrastive learning. scPROTEIN estimates the uncertainty of peptide quantification, denoises the protein data, removes batch effects, and encodes single-cell proteomic-specific embeddings in a unified framework. We demonstrate that our method is efficient for cell clustering, batch correction, cell-type annotation, and clinical analysis. Furthermore, our method can be readily applied to single-cell resolved spatial proteomic data, laying the foundation for encoding spatial proteomic data for tumor microenvironment analysis.
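To make the idea of heteroscedastic regression concrete, the following is a minimal sketch of the standard Gaussian negative log-likelihood in which the model predicts a variance for each observation alongside its mean. It is an illustration only and is not taken from scPROTEIN: the function name `heteroscedastic_nll`, the single-task setting, and the Gaussian noise assumption are all hypothetical choices made here.

```python
import numpy as np

def heteroscedastic_nll(y, mu, log_var):
    """Mean Gaussian negative log-likelihood with a per-observation variance.

    Hypothetical helper for illustration; not the scPROTEIN implementation.
    y       : observed values (e.g. peptide intensities)
    mu      : predicted means
    log_var : predicted log-variances (log keeps the variance positive)
    The additive constant 0.5 * log(2 * pi) is dropped.
    """
    y, mu, log_var = (np.asarray(a, dtype=float) for a in (y, mu, log_var))
    # Squared residuals are down-weighted where the predicted variance is high,
    # while the log_var term penalizes claiming high variance everywhere.
    return float(np.mean(0.5 * (log_var + (y - mu) ** 2 / np.exp(log_var))))

# A large residual costs less when the model flags it as uncertain.
confident = heteroscedastic_nll([0.0], [3.0], [0.0])          # variance 1
uncertain = heteroscedastic_nll([0.0], [3.0], [np.log(9.0)])  # variance 9
print(confident, uncertain)
```

Under this loss, the predicted variance acts as a learned per-observation uncertainty estimate, which is the general role the abstract assigns to peptide uncertainty estimation.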