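Random forest
Water quality
Decision tree
Machine learning
Computer science
Key (lock)
Ensemble learning
Ensemble forecasting
Artificial intelligence
Data mining
Predictive modeling
Ecology
Computer security
Biology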
Authors
Kangyang Chen, Hexia Chen, Chuanlong Zhou, Yichao Huang, Xiangyang Qi, Rongxi Shen, Fengrui Liu, Min Zuo, Xinyi Zou, Jinfeng Wang, Yan Zhang, Da Chen, Xingguo Chen, Yongfeng Deng, Hongqiang Ren
Source
Journal: Water Research
[Elsevier]
Date: 2020-03-01
Volume/issue: 171: 115454-115454
Citations: 296
Identifier
DOI:10.1016/j.watres.2019.115454
Abstract
The water quality prediction performance of machine learning models may depend not only on the models themselves, but also on the parameters in the data set chosen for training them. Moreover, the key water parameters should also be identified by the learning models, in order to further reduce prediction costs and improve prediction efficiency. Here we endeavored for the first time to compare the water quality prediction performance of 10 learning models (7 traditional and 3 ensemble models) using big data (33,612 observations) from the major rivers and lakes in China from 2012 to 2018, based on precision, recall, F1-score, and weighted F1-score, and to explore the potential key water parameters for future model prediction. Our results showed that larger data sets could improve the performance of learning models in predicting water quality. Compared to the other 7 models, decision tree (DT), random forest (RF) and deep cascade forest (DCF) trained on data sets of pH, DO, CODMn, and NH3-N performed significantly better in predicting all 6 levels of water quality recommended by the Chinese government. Moreover, two key water parameter sets (DO, CODMn, and NH3-N; CODMn and NH3-N) were identified and validated by DT, RF and DCF as having high specificity for predicting water quality. Therefore, DT, RF and DCF with the selected key water parameters could be prioritized for future water quality monitoring and for providing timely water quality warnings.
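The abstract ranks models by precision, recall, F1-score, and weighted F1-score over several water quality levels. As a rough illustration of how those metrics relate, the sketch below computes them per class in plain Python and then averages the per-class F1 scores weighted by class support. The function names and the toy labels are hypothetical and are not taken from the paper or its data.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this class, but it was wrong
            fn[truth] += 1  # missed the true class
    support = Counter(y_true)

    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1, support[label])
    return scores


def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighting each class by its support."""
    scores = per_class_scores(y_true, y_pred)
    total = len(y_true)
    return sum(f1 * support for _, _, f1, support in scores.values()) / total


# Toy example: three made-up quality levels, six made-up samples.
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 2, 2, 2, 3, 1]
for level, (p, r, f1, n) in per_class_scores(y_true, y_pred).items():
    print(f"level {level}: precision={p:.2f} recall={r:.2f} f1={f1:.2f} support={n}")
print(f"weighted F1 = {weighted_f1(y_true, y_pred):.3f}")
```

Weighting by support means that frequent classes dominate the summary score, which matters when some quality levels are much rarer than others in the observations.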