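Genome
Phylogenetic tree
Normalization (sociology)
Machine learning
Artificial intelligence
Computer science
Host (biology)
Support vector machine
Random forest
Graph
Computational biology
Data mining
Biology
Gene
Theoretical computer science
Genetics
Anthropology
Sociology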
Authors
Bojing Li,Duo Zhong,Jimei Qiao,Xingpeng Jiang
Source
Journal: Methods
[Elsevier]
Date: 2022-05-01
Volume: 205, pp. 11-17
Citations: 1
Identifier
DOI: 10.1016/j.ymeth.2022.05.007
Abstract
Microorganisms play important roles in our lives, especially in metabolism and disease. Determining the probability that a person will suffer from a specific disease, and the severity of that disease, based on microbial genes is crucial for understanding the relationship between microbes and diseases. Previous studies extracted the topological information of phylogenetic trees and integrated it into metagenomic datasets, enabling classifiers to learn more from limited data and thereby improving model performance. In this paper, we propose the GNPI model to better learn the structure of phylogenetic trees. GNPI maintains the original vector format of metagenomic datasets, whereas previous approaches had to convert the input into matrices. The vector-form input can be readily used by baseline machine learning models and is also suitable for deep learning models. Datasets processed with GNPI improve the accuracy of machine learning and deep learning models on three different datasets. GNPI is an interpretable data processing method for host phenotype prediction and other bioinformatics tasks.