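Cluster analysis
Rand index
Computer science
Differentiable function
Hierarchical clustering
Data mining
Measure (data warehouse)
Benchmark (surveying)
Correlation
Similarity measure
Robustness (evolution)
Variance (accounting)
Source code
Computational biology
Pattern recognition (psychology)
Artificial intelligence
Algorithm
Mathematics
Gene
Biology
Accounting
Mathematical analysis
Business
Operating system
Biochemistry
Geography
Geodesy
Geometry
Authors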
Hao Jiang,Lydia L. Sohn,Haiyan Huang,Luonan Chen
Source
Journal: Bioinformatics
[Oxford University Press]
Date: 2018-05-16
Volume (issue): 34 (21): 3684-3694
Citations: 71
Identifiers
DOI:10.1093/bioinformatics/bty390
Abstract
Motivation: The rapid advancement of single cell technologies has shed new light on the complex mechanisms of cellular heterogeneity. Identification of intercellular transcriptomic heterogeneity is one of the most critical tasks in single-cell RNA-sequencing studies.

Results: We propose a new cell similarity measure based on cell-pair differentiability correlation, which is derived from gene differential pattern among all cell pairs. Through plugging into the framework of hierarchical clustering with this new measure, we further develop a variance analysis based clustering algorithm ‘Corr’ that can determine cluster number automatically and identify cell types accurately. The robustness and superiority of the proposed algorithm are compared with representative algorithms: shared nearest neighbor (SNN)-Cliq and several other state-of-the-art clustering methods, on many benchmark or real single cell RNA-sequencing datasets in terms of both internal criteria (clustering number and accuracy) and external criteria (purity, adjusted rand index, F1-measure). Moreover, differentiability vector with our new measure provides a new means in identifying potential biomarkers from cancer related single cell datasets even with strong noise. Prognosis analyses from independent datasets of cancers confirmed the effectiveness of our ‘Corr’ method.

Availability and implementation: The source code (Matlab) is available at http://sysbio.sibcb.ac.cn/cb/chenlab/soft/Corr--SourceCodes.zip

Supplementary information: Supplementary data are available at Bioinformatics online.
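Two of the external criteria named in the abstract, purity and the adjusted Rand index, can be illustrated with a short Python sketch. This is not the authors' Matlab code and does not reproduce the ‘Corr’ algorithm; it only shows the standard textbook definitions of the two scores. The function names `purity` and `adjusted_rand_index` and the toy labels are assumptions made for illustration.

```python
from collections import Counter
from math import comb


def purity(true_labels, pred_labels):
    """Fraction of cells that belong to the majority true class of their cluster."""
    pair_counts = Counter(zip(pred_labels, true_labels))
    best = {}  # cluster id -> size of its largest true-class subgroup
    for (cluster, _true_class), n in pair_counts.items():
        best[cluster] = max(best.get(cluster, 0), n)
    return sum(best.values()) / len(true_labels)


def adjusted_rand_index(true_labels, pred_labels):
    """Chance-corrected pair-counting agreement (assumes at least two cells)."""
    n = len(true_labels)
    # Pairs of cells that share both a true class and a predicted cluster.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(true_labels, pred_labels)).values())
    # Pairs that share a true class, and pairs that share a predicted cluster.
    sum_rows = sum(comb(c, 2) for c in Counter(true_labels).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred_labels).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        # Degenerate case (e.g. both partitions put every cell in one group).
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


if __name__ == "__main__":
    # Hypothetical toy example: six cells, two true cell types, three predicted clusters.
    true_types = [0, 0, 0, 1, 1, 1]
    clusters = [0, 0, 1, 1, 2, 2]
    print("purity:", purity(true_types, clusters))
    print("ARI:", adjusted_rand_index(true_types, clusters))
```

Both scores compare predicted clusters against known cell-type labels, so they depend only on how cells are grouped and not on the cluster ids themselves. Purity rises as clusters are split further, which is one reason it is usually reported alongside a chance-corrected score such as the adjusted Rand index.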