A Feature Weighting-Assisted Approach for Cancer Subtypes Identification From Paired Expression Profiles

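Keywords: Weighting; Computational biology; Identification (biology); Set (abstract data type); Computer science; Cluster analysis; Similarity (geometry); Data mining; Cancer; Feature (linguistics); Artificial intelligence; Pattern recognition (psychology); Biomedicine; Genetics; Image (mathematics); Radiology; Philosophy; Botany; Programming language; Linguistics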

Authors

M. P. Singh, Sushmita Paul

Source

Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics [Institute of Electrical and Electronics Engineers]
Date: 2022-05-01; Volume (Issue): 19 (3): 1403-1414; Cited by: 1

Links

nih.gov, doi.org

Identifier

DOI: 10.1109/tcbb.2020.3041723

Abstract

Identifying cancer subtypes is critically important for understanding the heterogeneity present in tumors. Projects like The Cancer Genome Atlas (TCGA) have made available data sets containing expression profiles of multiple types of biomarkers across the same set of samples. The availability of such data sets helps in capturing the heterogeneity underlying complex biological processes and phenotypes. Further, by integrating information from multiple sources, homogeneous groups within a cancer can be identified. However, there is a lack of computational approaches for identifying histological subtypes among patients suffering from different types of cancer. Assigning weights to the biomarkers before integrating multiple information sources for the same set of samples can play an important role in cancer subtype identification, and this has not been explored previously. Subtyping of cancers can help in analyzing shared molecular profiles between different histological subtypes of solid tumors, which can in turn help in designing appropriate therapies and treatments. This study develops a novel feature-weighting method based on a robust regression fit. The method assigns a weight to every biomarker on the basis of its variability across the samples. These weights are then used to compute the similarity between patients separately for each information source. The two information sources used in this study are miRNA and mRNA expression profiles measured across the same set of samples. The patient-similarity networks generated from each expression profile are then integrated using Similarity Network Fusion. Finally, spectral clustering is applied to the fused network to identify groups of similar patients, each representing a cancer subtype. To establish the efficiency of the proposed approach, it has been applied to three cancer data sets and compared with other existing methods.
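
The pipeline described in the abstract (weight features, build one patient-similarity network per data type, fuse the networks, cluster the fused network) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the abstract does not give the details of the robust-regression weighting, so the hypothetical `variability_weights` uses the median absolute deviation as a stand-in; `fuse_by_averaging` is a plain element-wise mean rather than Similarity Network Fusion; and `spectral_bipartition` is a two-group spectral split by the sign of the Fiedler vector rather than general k-way spectral clustering. The data are synthetic toy values.

```python
import numpy as np

def variability_weights(X):
    """Hypothetical weights: per-feature median absolute deviation, scaled to sum to 1.

    X has shape (n_samples, n_features). This is a stand-in, NOT the paper's
    robust-regression weighting.
    """
    med = np.median(X, axis=0)
    mad = np.median(np.abs(X - med), axis=0)
    total = mad.sum()
    if total == 0:
        return np.full(X.shape[1], 1.0 / X.shape[1])
    return mad / total

def weighted_affinity(X, w):
    """Gaussian affinity from feature-weighted squared Euclidean distances."""
    Xw = X * np.sqrt(w)  # scaling by sqrt(w) weights each squared difference by w
    sq = np.sum(Xw ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Xw @ Xw.T, 0.0)
    positive = d2[d2 > 0]
    sigma2 = np.median(positive) if positive.size else 1.0  # assumed bandwidth choice
    return np.exp(-d2 / sigma2)

def fuse_by_averaging(affinities):
    """Placeholder fusion: element-wise mean of the networks (not SNF)."""
    return np.mean(affinities, axis=0)

def spectral_bipartition(W):
    """Two groups from the sign of the Fiedler vector of the normalized Laplacian."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues are returned in ascending order
    return (vecs[:, 1] > 0).astype(int)

# Toy data: 20 patients, two views, second group shifted in the first 3 features.
rng = np.random.default_rng(0)
shift = np.zeros((20, 6))
shift[10:, :3] = 6.0
mirna = rng.normal(size=(20, 6)) + shift
mrna = rng.normal(size=(20, 6)) + shift

networks = [weighted_affinity(v, variability_weights(v)) for v in (mirna, mrna)]
fused = fuse_by_averaging(networks)
labels = spectral_bipartition(fused)
```

In this sketch the weighting is applied before the similarity computation, mirroring the order given in the abstract. A faithful reproduction would replace the averaging step with the iterative cross-network diffusion of Similarity Network Fusion and the sign-based split with k-way spectral clustering.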
