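Imputation (statistics)
Dropout (neural networks)
Posterior probability
Bayesian probability
RNA-Seq
Computer science
Markov chain Monte Carlo
Cluster analysis
Missing data
Computational biology
Biology
Gene
Data mining
Gene expression
Artificial intelligence
Genetics
Transcriptome
Machine learning
Authors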
Siqi Chen,Ruiqing Zheng,Luyi Tian,Fang‐Xiang Wu,Min Li
Source
Journal: Methods
[Elsevier]
Date: 2023-08-01
Volume: 216, pages 21-38
Citations: 3
Identifier
DOI:10.1016/j.ymeth.2023.06.004
Abstract
Single-cell RNA-sequencing (scRNA-seq) data suffer from a lot of zeros. Such dropout events impede the downstream data analyses. We propose BayesImpute to infer and impute dropouts from the scRNA-seq data. Using the expression rate and coefficient of variation of the genes within the cell subpopulation, BayesImpute first determines likely dropouts, and then constructs the posterior distribution for each gene and uses the posterior mean to impute dropout values. Some simulated and real experiments show that BayesImpute can effectively identify dropout events and reduce the introduction of false positive signals. Additionally, BayesImpute successfully recovers the true expression levels of missing values, restores the gene-to-gene and cell-to-cell correlation coefficient, and maintains the biological information in bulk RNA-seq data. Furthermore, BayesImpute boosts the clustering and visualization of cell subpopulations and improves the identification of differentially expressed genes. We further demonstrate that, in comparison to other statistical-based imputation methods, BayesImpute is scalable and fast with minimal memory usage.
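The two-step idea in the abstract (flag likely dropouts from each gene's expression rate and coefficient of variation within a cell subpopulation, then fill them with a posterior mean) can be illustrated with a minimal sketch. This is not the BayesImpute implementation: the function name `impute_subpopulation`, the thresholds `rate_threshold` and `cv_threshold`, and the normal-normal conjugate prior (`prior_mean`, `prior_var`) are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np


def impute_subpopulation(expr, rate_threshold=0.5, cv_threshold=1.0,
                         prior_mean=None, prior_var=1.0):
    """Hypothetical sketch: flag likely dropouts, then fill them with a posterior mean.

    expr: genes x cells array of non-negative expression values for ONE cell
    subpopulation. Thresholds and the normal-normal prior are illustrative
    assumptions, not values taken from the paper.
    """
    expr = np.asarray(expr, dtype=float)
    imputed = expr.copy()
    nonzero = expr > 0

    # Assumed default prior mean: average of all observed (non-zero) values.
    if prior_mean is None:
        prior_mean = expr[nonzero].mean() if nonzero.any() else 0.0

    dropout_mask = np.zeros_like(nonzero)  # boolean array, all False
    for g in range(expr.shape[0]):
        observed = expr[g, nonzero[g]]
        n = observed.size
        if n < 2:
            continue  # too few observations to estimate a variance
        rate = n / expr.shape[1]          # expression rate of gene g
        mean = observed.mean()
        var = observed.var(ddof=1)
        cv = np.sqrt(var) / mean          # coefficient of variation

        # Only zeros of widely and consistently expressed genes count as dropouts.
        if rate < rate_threshold or cv > cv_threshold:
            continue

        var = max(var, 1e-8)  # guard against zero sample variance
        # Posterior mean of a normal likelihood with a normal prior on the mean.
        precision = 1.0 / prior_var + n / var
        post_mean = (prior_mean / prior_var + n * mean / var) / precision

        zeros = ~nonzero[g]
        imputed[g, zeros] = post_mean
        dropout_mask[g, zeros] = True
    return imputed, dropout_mask


# Toy subpopulation: 2 genes x 4 cells.
expr = np.array([
    [2.0, 2.2, 0.0, 1.8],  # expressed in 3 of 4 cells with low variability
    [0.0, 0.0, 0.0, 3.0],  # expressed in 1 of 4 cells: zeros are left alone
])
imputed, mask = impute_subpopulation(expr)
```

In this toy input only the single zero of the first gene is flagged and replaced; the zeros of the second gene are kept as likely true zeros, which mirrors the abstract's aim of limiting false positive signals. A conjugate normal model is used here only because its posterior mean has a closed form.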