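Cluster analysis
Computer science
Omics
Graph
Scalability
Embedding
Computational biology
Machine learning
Data mining
Artificial intelligence
Bioinformatics
Biology
Theoretical computer science
Database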
Authors
Bingjun Li,Sheida Nabavi
Identifier
DOI:10.1109/bibm58861.2023.10385267
Abstract
Recent advancements in single-cell multiomics sequencing create new research opportunities but also pose challenges, particularly in cell clustering. One major challenge is feature fusion. Early fusion models are robust but ignore the unique distributions of individual omics and cannot handle omics of differing dimensions. Most current clustering methods use late fusion, employing an independent encoder for each omic. However, the extracted omic features belong to different latent spaces, which makes the omics difficult to align. Additionally, current cell clustering methods do not incorporate prior biological knowledge, such as interactions within and across omics, which have been shown to play a key role in defining cell types. To address these shortcomings, we propose a novel, scalable, end-to-end clustering method called single-cell graph embedding multiomics cluster (scGEMOC). scGEMOC uses prior biological knowledge to represent inter- and intra-omics connections as a heterogeneous graph. It applies graph embedding to aggregate omics interaction data into a pseudo omic and employs contrastive learning to align omics effectively in the latent space. We evaluated scGEMOC on three public datasets against five state-of-the-art baseline models. scGEMOC achieves superior clustering performance compared with the baseline models on all datasets. An ablation study confirms the significant contribution of each component and identifies the most impactful one.
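The abstract does not specify which contrastive objective scGEMOC uses. As a rough illustration of how contrastive learning can pull per-omic embeddings of the same cell together in a shared latent space, the sketch below implements a generic symmetric InfoNCE loss in plain Python. The function names, the temperature value, and the toy embeddings are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit L2 norm so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def info_nce(z_a, z_b, temperature=0.1):
    """Symmetric InfoNCE loss between two lists of paired cell embeddings.

    Row i of z_a and row i of z_b are assumed to describe the same cell in
    two different omics (hypothetical setup, not the paper's exact loss).
    """
    z_a = [normalize(v) for v in z_a]
    z_b = [normalize(v) for v in z_b]
    # logits[i][j] = cosine similarity of cell i (omic A) and cell j (omic B)
    logits = [
        [sum(p * q for p, q in zip(a, b)) / temperature for b in z_b]
        for a in z_a
    ]

    def cross_entropy(rows):
        # The matching cell (diagonal entry) is the positive for each row.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)  # subtract the max for numerical stability
            log_denom = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_denom - row[i]
        return total / len(rows)

    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy(logits) + cross_entropy(transposed))


def random_cells(n_cells, dim):
    """Toy embeddings drawn from a standard normal distribution."""
    return [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_cells)]


random.seed(0)
omic_a = random_cells(8, 16)
# Nearly identical embeddings stand in for a well-aligned second omic.
omic_b_aligned = [[x + random.gauss(0, 0.01) for x in cell] for cell in omic_a]
# Independent embeddings stand in for an unaligned second omic.
omic_b_random = random_cells(8, 16)

loss_aligned = info_nce(omic_a, omic_b_aligned)
loss_random = info_nce(omic_a, omic_b_random)
print(loss_aligned < loss_random)
```

In this toy setup the loss is lower when the two omics place each cell at nearly the same point, which is the behavior an alignment objective of this kind rewards during training.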