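Autoencoder
Proteogenomics
Serous ovarian cancer
Computer science
Deep learning
Lasso (programming language)
Computational biology
Cancer
Artificial intelligence
Bioinformatics
Biology
Gene
Genomics
Ovarian cancer
Genetics
World Wide Web
Genome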
Authors
Huiqing Wang, Haolin Li, Jia-Le Han, Zhipeng Feng, Hongxia Deng, Xiao Han
Identifier
DOI:10.1016/j.compbiolchem.2023.107906
Abstract
High-grade serous ovarian cancer (HGSOC) is a type of ovarian cancer that develops from serous tubal intraepithelial carcinoma. The intrinsic differences among its molecular subtypes are closely associated with prognosis and pathological characteristics. At present, multi-omics data integration methods include early integration and late integration. Most existing HGSOC molecular subtype classification methods are based on early integration of multi-omics data. These methods ignore the mutual interference among the omics data types, which reduces the effectiveness of feature learning. High-dimensional multi-omics data also contain genes unassociated with HGSOC molecular subtypes, and the resulting redundant information hinders model training. In this paper, we propose a multi-modal deep autoencoder learning method, MMDAE-HGSOC. miRNA expression, DNA methylation, and copy number variation (CNV) data are integrated with mRNA expression data to construct a multi-omics feature space. A multi-modal deep autoencoder network is used to learn a high-level feature representation of the multi-omics data. A superposition LASSO (S-LASSO) regression algorithm is proposed to identify the genes associated with HGSOC molecular subtypes as completely as possible. The experimental results show that MMDAE-HGSOC outperforms existing classification methods. Finally, we analyze the enriched Gene Ontology (GO) terms and biological pathways of the significant genes discovered during the gene selection process.
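
The abstract describes the multi-modal autoencoder only at a high level, so the following is a minimal, untrained forward-pass sketch of the general idea rather than the authors' architecture. It assumes one small encoder per omics type, a shared bottleneck over the concatenated hidden vectors, and one decoder per omics type. The class name `MultiModalAutoencoder`, the layer sizes, the tanh activation, and the random toy data are all illustrative assumptions and do not come from the paper.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, weights, biases):
    """Apply one fully connected layer with tanh activation to a vector."""
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


class MultiModalAutoencoder:
    """Hypothetical sketch: per-modality encoders feeding a shared code."""

    def __init__(self, modality_dims, hidden_dim, shared_dim, seed=0):
        rng = random.Random(seed)
        self.names = list(modality_dims)
        # Each omics type gets its own encoder, so modalities are not
        # mixed at the raw-feature level.
        self.encoders = {m: init_layer(d, hidden_dim, rng)
                         for m, d in modality_dims.items()}
        # The shared layer fuses the concatenated hidden vectors.
        self.shared = init_layer(hidden_dim * len(self.names), shared_dim, rng)
        # Each omics type is reconstructed from the shared code.
        self.decoders = {m: init_layer(shared_dim, d, rng)
                         for m, d in modality_dims.items()}

    def encode(self, sample):
        hidden = []
        for m in self.names:
            hidden.extend(dense(sample[m], *self.encoders[m]))
        return dense(hidden, *self.shared)

    def reconstruct(self, sample):
        code = self.encode(sample)
        return {m: dense(code, *self.decoders[m]) for m in self.names}


def reconstruction_loss(sample, recon):
    """Sum over modalities of the per-modality mean squared error."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(sample[m], recon[m])) / len(sample[m])
        for m in sample
    )


# Toy feature counts per omics type (assumed; real data are far larger).
dims = {"mrna": 6, "mirna": 4, "methylation": 5, "cnv": 3}
data_rng = random.Random(1)
sample = {m: [data_rng.uniform(-1, 1) for _ in range(d)]
          for m, d in dims.items()}

model = MultiModalAutoencoder(dims, hidden_dim=3, shared_dim=2)
code = model.encode(sample)
recon = model.reconstruct(sample)
loss = reconstruction_loss(sample, recon)
```

In a real pipeline the weights would be trained to minimize the reconstruction loss, and the shared code would then serve as the input to a subtype classifier. Training and the S-LASSO gene selection step are omitted here because the abstract does not specify them.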