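Non-negative matrix factorization
Pattern recognition (psychology)
Computer science
Feature (linguistics)
Matrix decomposition
Artificial intelligence
Computational biology
Biology
Linguistics
Quantum mechanics
Physics
Philosophy
Eigenvector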
Authors
Jin Deng,Weiming Zeng,Sizhe Luo,Wei Kong,Yuhu Shi,Ying Li,Hua Zhang
Identifier
DOI:10.1016/j.ins.2021.06.058
Abstract
Integrative analysis of histopathology images and genomic data enables the discovery of potential biomarkers and multimodal association patterns. However, few studies have established effective association models for complex diseases, such as sarcoma, by combining histopathological images with multiple types of genetic variation data. Here, we present an integrative multiple genomic imaging framework called multi-dimensional constrained joint non-negative matrix factorization (MDJNMF) to identify modules related to lung metastasis of sarcomas based on sample-matched whole-slide image, DNA methylation, and copy number variation features. The three types of feature matrices are projected onto a common feature space, in which heterogeneous variables with large coefficients in the same projected direction form a common module. The correlation between image features and genetic variation features is used as a network-regularized constraint to improve module accuracy. Sparsity and orthogonality constraints are used to obtain a sparse modular solution. Multi-level analysis indicates that our method effectively discovers biologically functional modules associated with sarcoma or lung metastasis. The representative module reveals a significant correlation between image features and genetic variation features and uncovers potential diagnostic biomarkers. In summary, the proposed method provides new clues for identifying association patterns and biomarkers from multiple types of data sources, and may be applicable to other diseases.
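To make the projection step described in the abstract more concrete, the following is a minimal sketch of plain joint non-negative matrix factorization, not the authors' MDJNMF. It assumes sample-matched matrices X_i (samples in rows) that share one basis W, so that X_i ≈ W H_i, and it uses standard multiplicative updates. The network-regularized, sparsity, and orthogonality constraints mentioned above are deliberately omitted. The function names `joint_nmf` and `select_modules`, the synthetic data, and the z-score threshold used to pick "large coefficients" are all illustrative assumptions.

```python
import numpy as np


def joint_nmf(Xs, k, n_iter=200, eps=1e-9, seed=0):
    """Factor each X_i (n x p_i) as W @ H_i with one shared non-negative W (n x k).

    Plain joint NMF with multiplicative updates; no network, sparsity,
    or orthogonality terms (those belong to MDJNMF and are not sketched here).
    """
    rng = np.random.default_rng(seed)
    n = Xs[0].shape[0]
    W = rng.random((n, k)) + eps
    Hs = [rng.random((k, X.shape[1])) + eps for X in Xs]
    losses = []
    for _ in range(n_iter):
        # Update the shared basis using all data types at once.
        numerator = sum(X @ H.T for X, H in zip(Xs, Hs))
        denominator = W @ sum(H @ H.T for H in Hs) + eps
        W *= numerator / denominator
        # Update each coefficient matrix against the new shared basis.
        WtW = W.T @ W
        Hs = [H * (W.T @ X) / (WtW @ H + eps) for X, H in zip(Xs, Hs)]
        # Track the summed squared reconstruction error.
        losses.append(sum(np.linalg.norm(X - W @ H) ** 2 for X, H in zip(Xs, Hs)))
    return W, Hs, losses


def select_modules(H, z_thresh=2.0):
    """For each row (projected direction) of H, return indices of features
    whose row-wise z-score exceeds z_thresh (an assumed selection rule)."""
    mu = H.mean(axis=1, keepdims=True)
    sd = H.std(axis=1, keepdims=True) + 1e-12
    z = (H - mu) / sd
    return [np.flatnonzero(row > z_thresh) for row in z]


# Synthetic stand-ins for image, methylation, and copy number feature matrices
# (30 matched samples; the feature counts are arbitrary).
rng = np.random.default_rng(1)
W_true = rng.random((30, 3))
Xs = [W_true @ rng.random((3, p)) for p in (40, 25, 15)]

W, Hs, losses = joint_nmf(Xs, k=3)
modules = [select_modules(H) for H in Hs]
```

Under these assumptions, the features selected from row j of every H_i together form the j-th candidate module, which mirrors the idea of grouping heterogeneous variables with large coefficients in the same projected direction.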