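Support vector machine
Artificial intelligence
Random forest
Computer science
Binary number
Machine learning
Binary classification
Artificial neural network
Estimator
Predictive modelling
Data mining
Pattern recognition (psychology)
Mathematics
Statistics
Arithmetic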
Authors
Nuohan Xu, Zhenyan Zhang, Yechao Shen, Qi Zhang, Zhen Liu, Yitian Yu, Yan Wang, Chaotang Lei, Mingjing Ke, Danyan Qiu, Tao Lu, Yi‐Ling Chen, Juntao Xiong, Haifeng Qian
Identifier
DOI:10.1016/j.scitotenv.2022.155807
Abstract
The development of machine learning and deep learning has provided solutions for predicting microbiota responses to environmental change from microbial high-throughput sequencing data. However, few studies have specifically compared the performance and practicality of these two types of binary classification models to identify a better algorithm for microbiota data analysis. Here, for the first time, we evaluated the performance, accuracy and running time of binary classification models built with three machine learning methods, random forest (RF), support vector machine (SVM) and logistic regression (LR), and one deep learning method, back propagation neural network (BPNN). The models were built on microbiota datasets from which low-quality variables had been removed and in which the class imbalance problem had been resolved. Additionally, we optimized the models by tuning. Our study demonstrated that dataset pre-processing is a necessary step in model construction. Among these four binary classification models, BPNN and RF were the most suitable methods for constructing microbiota binary classification models. When the four models were used to predict multiple microbial datasets, BPNN showed the highest accuracy and the most robust performance, while RF ranked second. We also constructed the optimal models by adjusting the epochs of BPNN and the n_estimators of RF six times. This evaluation of model performance provides a road map for applying artificial intelligence to the assessment of microbial ecology.
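The two pre-processing steps named in the abstract (removing low-quality variables and correcting class imbalance) can be illustrated with a minimal, standard-library-only sketch. The abstract does not say which filters or resampling scheme the authors used, so everything here is an assumption for illustration: the prevalence threshold, random oversampling as the balancing method, the toy abundance table, and the helper names `drop_rare_features`, `oversample_minority` and `accuracy` are all hypothetical.

```python
import random
from collections import Counter


def drop_rare_features(rows, min_prevalence=0.5):
    """Keep columns that are non-zero in at least `min_prevalence` of samples.

    Assumption: "low-quality variable" is read here as a rarely observed taxon.
    Returns the filtered rows and the indices of the columns that were kept.
    """
    n_samples = len(rows)
    keep = [
        j for j in range(len(rows[0]))
        if sum(1 for r in rows if r[j] > 0) / n_samples >= min_prevalence
    ]
    return [[r[j] for j in keep] for r in rows], keep


def oversample_minority(rows, labels, seed=0):
    """Duplicate randomly chosen minority samples until all classes are equal in size.

    Assumption: random oversampling stands in for whatever balancing
    method the study actually applied.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true binary labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical abundance table: 6 samples x 3 taxa, with a 4:2 class imbalance.
abundance = [
    [5, 0, 1],
    [3, 0, 0],
    [4, 0, 2],
    [6, 0, 0],
    [0, 1, 7],
    [1, 0, 9],
]
labels = [0, 0, 0, 0, 1, 1]

filtered, kept_columns = drop_rare_features(abundance, min_prevalence=0.5)
balanced_rows, balanced_labels = oversample_minority(filtered, labels, seed=0)
```

In practice, resampling of this kind is normally applied to the training split only, so that duplicated samples cannot leak into the data used to measure accuracy.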