插补(统计学)
计算机科学
图形
生成模型
基因调控网络
生成对抗网络
RNA序列
缺少数据
数据挖掘
人工智能
基因
生成语法
机器学习
理论计算机科学
深度学习
生物
基因表达
遗传学
转录组
作者
Zimo Huang,Jun Wang,Xudong Lü,Azlan Mohd Zain,Guoxian Yu
摘要
Single-cell RNA sequencing (scRNA-seq) data are typically with a large number of missing values, which often results in the loss of critical gene signaling information and seriously limit the downstream analysis. Deep learning-based imputation methods often can better handle scRNA-seq data than shallow ones, but most of them do not consider the inherent relations between genes, and the expression of a gene is often regulated by other genes. Therefore, it is essential to impute scRNA-seq data by considering the regional gene-to-gene relations. We propose a novel model (named scGGAN) to impute scRNA-seq data that learns the gene-to-gene relations by Graph Convolutional Networks (GCN) and global scRNA-seq data distribution by Generative Adversarial Networks (GAN). scGGAN first leverages single-cell and bulk genomics data to explore inherent relations between genes and builds a more compact gene relation network to jointly capture the homogeneous and heterogeneous information. Then, it constructs a GCN-based GAN model to integrate the scRNA-seq, gene sequencing data and gene relation network for generating scRNA-seq data, and trains the model through adversarial learning. Finally, it utilizes data generated by the trained GCN-based GAN model to impute scRNA-seq data. Experiments on simulated and real scRNA-seq datasets show that scGGAN can effectively identify dropout events, recover the biologically meaningful expressions, determine subcellular states and types, improve the differential expression analysis and temporal dynamics analysis. Ablation experiments confirm that both the gene relation network and gene sequence data help the imputation of scRNA-seq data.
科研通智能强力驱动
Strongly Powered by AbleSci AI