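Computer science
Similarity (geometry)
Artificial intelligence
Silhouette
Data mining
Curse of dimensionality
Sensor fusion
Spectral clustering
Pattern recognition (psychology)
Feature (linguistics)
Cluster analysis
Machine learning
Image (mathematics)
Linguistics
Philosophy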
Authors
Shuhui Liu,Xuequn Shang
Identifier
DOI:10.1007/978-3-319-94968-0_11
Abstract
Recent breakthroughs in biological sequencing technologies have made it cost-effective to collect diverse types of observations. Integrative analysis of multi-platform cancer data, which can reveal intrinsic characteristics of a biological process, has become an attractive research route for cancer subtype discovery. Most machine-learning-based methods need to represent every input data type in a unified space, which loses important features or introduces noise in some data types. Furthermore, many network-based data integration methods treat each data type independently, leading to inconsistent conclusions. Similarity network fusion (SNF) was developed to address these problems. However, the Euclidean distance metric employed in SNF suffers from the curse of dimensionality and thus gives poor results. To this end, we propose a new integrative method, dubbed hierarchical similarity network fusion (HSNF), to learn a fused, discriminative patient similarity network. HSNF randomly samples sub-features from the different input data types to construct multiple input similarity matrices that serve as the basis for fusion, so that repeated random sampling generates diverse similarity matrices. We then design a hierarchical fusion framework to make full use of the complementarity of the diverse similarity networks from different feature modalities. Finally, spectral clustering is applied to the final fused similarity matrix to discover cancer subtypes. Experimental results on five public cancer datasets show that HSNF discovers significantly different subtypes and consistently outperforms state-of-the-art methods in terms of silhouette score and the p-value of survival analysis.
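The sampling-then-fusing idea in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract does not specify the similarity kernel or the fusion rule, so this sketch assumes a Gaussian kernel on Euclidean distance and uses a plain element-wise mean as a stand-in for SNF-style fusion. All function names, parameters (`n_samples`, `frac`, `sigma`), and the toy data are hypothetical.

```python
import math
import random


def gaussian_similarity(rows, sigma=1.0):
    """Pairwise Gaussian-kernel similarity between patients (one row each)."""
    n = len(rows)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(rows[i], rows[j]))
            sim[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return sim


def sampled_similarities(data, n_samples, frac, rng):
    """One similarity matrix per random feature subset of a single data type."""
    n_features = len(data[0])
    k = max(1, int(frac * n_features))
    mats = []
    for _ in range(n_samples):
        cols = rng.sample(range(n_features), k)  # random sub-features
        sub = [[row[c] for c in cols] for row in data]
        mats.append(gaussian_similarity(sub))
    return mats


def mean_fusion(mats):
    """Assumed placeholder fusion: element-wise mean (not SNF's actual rule)."""
    n = len(mats[0])
    return [[sum(m[i][j] for m in mats) / len(mats) for j in range(n)]
            for i in range(n)]


def hierarchical_fusion(data_types, n_samples=5, frac=0.5, seed=0):
    """Level 1: fuse sampled matrices within each data type.
    Level 2: fuse the per-type results into one patient similarity matrix."""
    rng = random.Random(seed)
    per_type = [mean_fusion(sampled_similarities(d, n_samples, frac, rng))
                for d in data_types]
    return mean_fusion(per_type)


# Toy data: 4 patients, two data types; patients 0-1 and 2-3 form two groups.
expression = [[0.0, 0.1, 0.2, 0.1], [0.1, 0.0, 0.1, 0.2],
              [3.0, 3.1, 2.9, 3.0], [3.1, 3.0, 3.0, 2.9]]
methylation = [[1.0, 1.1], [1.1, 1.0], [4.0, 4.1], [4.1, 4.0]]

fused = hierarchical_fusion([expression, methylation])
```

The resulting `fused` matrix is what a spectral clustering routine would take as a precomputed affinity. The two-level structure is the point of the sketch; replacing `mean_fusion` with an actual SNF-style update would be needed to approximate the method described.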