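Cluster analysis
Data mining
Population
Computer science
Similarity (geometry)
Artificial intelligence
Rank (graph theory)
Machine learning
Mathematics
Combinatorics
Image (mathematics)
Sociology
Demography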
Authors
Yunpei Xu, Hong‐Dong Li, Yi Pan, Feng Luo, Fang‐Xiang Wu, Jianxin Wang
Identifier
DOI:10.1109/tcbb.2019.2931582
Abstract
Single-cell RNA sequencing (scRNA-seq) technology provides quantitative gene expression profiles at single-cell resolution. As a result, researchers have established new ways to explore cell population heterogeneity and genetic variability of cells. One current research direction for scRNA-seq data is to identify different cell types accurately through unsupervised clustering methods. However, scRNA-seq data analysis is challenging because of the data's high noise level, high dimensionality, and sparsity. Moreover, the impact of multiple latent factors on gene expression heterogeneity and on the ability to accurately identify cell types remains unclear. Overcoming these challenges to reveal the biological differences between cell types has become the key to analyzing scRNA-seq data. For these reasons, unsupervised learning for cell population discovery based on scRNA-seq data has become an important research area. A cell similarity assessment method plays a significant role in cell clustering. Here, we present BioRank, a new cell similarity assessment method based on annotated gene sets and gene ranks. To evaluate its performance, we cluster cells with two classical clustering algorithms using the cell-to-cell similarity obtained by BioRank. In addition, BioRank can be used with any clustering algorithm that requires a similarity matrix. Applying BioRank to 12 public scRNA-seq datasets, we show that it performs better than, or at least as well as, several popular similarity assessment methods for single-cell clustering.
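The abstract notes that a cell similarity matrix can be handed to any clustering algorithm that accepts one. The sketch below illustrates that workflow only. It does not implement BioRank, whose details are not given here. As a stand-in it assumes a plain Spearman-style similarity computed from within-cell gene ranks, and it uses average-linkage hierarchical clustering from SciPy. The function names `rank_similarity` and `cluster_from_similarity` and the toy expression matrix are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from scipy.stats import rankdata


def rank_similarity(expr):
    """Spearman-style similarity between cells (rows) from within-cell gene ranks.

    Illustrative stand-in only; this is NOT the BioRank algorithm.
    """
    # Rank genes inside each cell (ties get average ranks).
    ranks = np.apply_along_axis(rankdata, 1, expr)
    # Pearson correlation of rank vectors equals Spearman correlation.
    return np.corrcoef(ranks)


def cluster_from_similarity(sim, n_clusters):
    """Average-linkage hierarchical clustering on a precomputed similarity matrix."""
    dist = 1.0 - sim                      # turn similarity into a dissimilarity
    dist = (dist + dist.T) / 2.0          # enforce exact symmetry
    np.fill_diagonal(dist, 0.0)           # a cell is at distance 0 from itself
    condensed = squareform(dist, checks=False)
    tree = linkage(condensed, method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Toy data (assumption): 20 cells x 50 genes, two groups of 10 cells,
# each group over-expressing a different block of 25 genes.
rng = np.random.default_rng(0)
expr = rng.poisson(1.0, size=(20, 50)).astype(float)
expr[:10, :25] += rng.poisson(8.0, size=(10, 25))
expr[10:, 25:] += rng.poisson(8.0, size=(10, 25))

sim = rank_similarity(expr)
labels = cluster_from_similarity(sim, n_clusters=2)
print(labels)
```

Any other method that accepts a precomputed similarity or distance matrix could replace the hierarchical step, and a gene-set-aware similarity would replace `rank_similarity` without changing the rest of the pipeline.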