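Kernel (algebra)
Inference
Identification (biology)
Computer science
Multiple kernel learning
Set (abstract data type)
Artificial intelligence
Similarity (geometry)
Machine learning
Rank (graph theory)
Kernel method
String kernel
Data mining
Computational biology
Theoretical computer science
Support vector machine
Radial basis function kernel
Mathematics
Biology
Botany
Combinatorics
Image (mathematics)
Programming language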
Authors
Ekta Shah,Pradipta Maji
Identifier
DOI:10.1109/tcbb.2023.3247033
Abstract
Gene expression data sets and protein-protein interaction (PPI) networks are two heterogeneous data sources that have been studied extensively because they capture, respectively, the co-expression patterns among genes and their topological connections. Although they depict different traits of the data, both tend to group co-functional genes together. This agrees with the basic assumption of multi-view kernel learning, according to which different views of the data share a similar inherent cluster structure. Based on this inference, a new multi-view kernel learning based disease gene identification algorithm, termed DiGId, is put forward. A novel multi-view kernel learning approach is proposed that learns a consensus kernel, which efficiently captures the heterogeneous information of the individual views and also depicts the underlying inherent cluster structure. Low-rank constraints are imposed on the learned multi-view kernel so that it can be partitioned effectively into k or fewer clusters. The learned joint cluster structure is used to curate a set of potential disease genes. Moreover, a novel approach is put forward to quantify the importance of each view. To demonstrate the effectiveness of the proposed approach in capturing the relevant information depicted by individual views, an extensive analysis is performed on four different cancer-related gene expression data sets and a PPI network, considering different similarity measures.
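The general idea of combining one kernel per view into a weighted consensus kernel can be illustrated with a small sketch. This is not the DiGId algorithm: the abstract does not give its objective function, its low-rank constraints, or its view-importance measure, so the sketch swaps in a simple kernel-alignment heuristic for the view weights and omits the low-rank step entirely. The function names, the toy expression matrix, the toy PPI adjacency matrix, and the choice of an RBF kernel and a shared-neighbourhood kernel are all assumptions made for illustration.

```python
import math

def rbf_kernel(X, gamma=1.0):
    """Gaussian (RBF) kernel between the rows of X (genes x samples)."""
    n = len(X)
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(X[i], X[j])))
             for j in range(n)] for i in range(n)]

def neighborhood_kernel(A):
    """Shared-neighbourhood kernel (A + I)(A + I)^T from an adjacency matrix.
    Hypothetical choice of network kernel; positive semidefinite by construction."""
    n = len(A)
    B = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    return [[sum(B[i][k] * B[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

def frobenius(K1, K2):
    """Frobenius inner product of two equally sized matrices."""
    return sum(a * b for r1, r2 in zip(K1, K2) for a, b in zip(r1, r2))

def alignment(K1, K2):
    """Kernel alignment: cosine similarity of two kernels viewed as vectors."""
    return frobenius(K1, K2) / math.sqrt(frobenius(K1, K1) * frobenius(K2, K2))

def normalize(K):
    """Scale a kernel to unit Frobenius norm so that views are comparable."""
    s = math.sqrt(frobenius(K, K))
    return [[v / s for v in row] for row in K]

def consensus_kernel(kernels):
    """Weight each view by its alignment with the uniform average of all views
    (an assumed heuristic, not the paper's method) and return the weighted sum."""
    Ks = [normalize(K) for K in kernels]
    n = len(Ks[0])
    mean = [[sum(K[i][j] for K in Ks) / len(Ks) for j in range(n)] for i in range(n)]
    scores = [alignment(K, mean) for K in Ks]
    total = sum(scores)
    weights = [s / total for s in scores]
    consensus = [[sum(w * K[i][j] for w, K in zip(weights, Ks)) for j in range(n)]
                 for i in range(n)]
    return consensus, weights

# Toy data (made up): four genes forming two groups in both views.
expression = [[1.0, 1.1, 0.9], [1.1, 1.0, 1.0], [5.0, 5.2, 4.9], [5.1, 5.0, 5.0]]
ppi = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

consensus, weights = consensus_kernel(
    [rbf_kernel(expression, gamma=0.5), neighborhood_kernel(ppi)]
)
```

In this toy setting both views agree on the grouping, so the consensus similarity between genes 0 and 1 exceeds that between genes 0 and 2. A clustering step (for example, spectral clustering on `consensus`) would follow in a fuller pipeline.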