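Support vector machine
Gradient boosting
Environmental science
Mean squared error
Random forest
Sunshine duration
Categorical variable
Computer science
Meteorology
Machine learning
Statistics
Precipitation
Mathematics
Geography
Authors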
Junliang Fan,Xiukang Wang,Fucang Zhang,Xin Ma,Lifeng Wu
Identifier
DOI:10.1016/j.jclepro.2019.119264
Abstract
Knowledge of diffuse solar radiation (Rd) on horizontal surfaces is a prerequisite for the design and optimization of active and passive solar energy systems, such as solar illumination systems within buildings, but measurements of Rd are unavailable at many locations worldwide, so it is commonly predicted from readily available climatic variables. However, reliable prediction of Rd is difficult when complete or historical climatic data are lacking at the target station. This study evaluated the performance of support vector machine (SVM) and four tree-based soft computing models, i.e. M5 model tree (M5Tree), random forest (RF), extreme gradient boosting (XGBoost) and gradient boosting with categorical features support (CatBoost), for predicting daily horizontal Rd using limited local (Scenario 1) and extrinsic (Scenarios 2 and 3) climatic data. Six input combinations of daily global solar radiation (Rs), sunshine hours (n), maximum/minimum temperature (Tmax/Tmin) and relative humidity (RH) during 1996–2015 at 15 weather stations across various climatic zones of China were considered. The results demonstrated that, when Rs was unavailable, the average root mean square error (RMSE) increased considerably across China (42.4%) in Scenario 1, especially in the (sub)tropical monsoon zone (68.3%). SVM offered the best combination of prediction accuracy and generalization capability in all scenarios, followed by CatBoost. CatBoost produced the closest daily Rd estimates to SVM and satisfactory generalization capability. In Scenario 2, CatBoost and SVM models developed with climatic data from Beijing gave the overall best daily Rd estimates over the 15 stations, while models developed with data from 14 weather stations in Scenario 3 produced even better and steadier Rd estimates across China compared with those in Scenario 2. The average computational time of SVM (6.6 s) for a single sample was approximately 1.9 times that of CatBoost (3.5 s) in Scenarios 1 and 2, while the corresponding value (842.6 s) was approximately 33.9 times that of CatBoost (24.9 s) in Scenario 3. Comprehensively considering prediction accuracy, generalization capability and computational efficiency, CatBoost is highly recommended for developing general models for daily Rd prediction in various climatic zones of China, particularly when historical local climatic data are lacking.
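The abstract compares models by RMSE and reports how much the RMSE rises when Rs is dropped from the inputs. As a minimal sketch of that arithmetic (not the authors' code), the Python below computes an RMSE and a percentage increase. The function names `rmse` and `percent_increase` and all data values are hypothetical, chosen only for illustration.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def percent_increase(baseline, value):
    """Relative increase of `value` over `baseline`, in percent."""
    return 100.0 * (value - baseline) / baseline


# Hypothetical daily Rd values; these are NOT data from the study.
observed = [5.0, 6.0, 7.0, 8.0]
predicted_with_rs = [5.5, 5.5, 7.5, 7.5]      # assumed model that uses Rs
predicted_without_rs = [6.0, 5.0, 8.0, 7.0]   # assumed model without Rs

rmse_with = rmse(observed, predicted_with_rs)
rmse_without = rmse(observed, predicted_without_rs)
increase = percent_increase(rmse_with, rmse_without)
print(f"RMSE with Rs: {rmse_with:.2f}, without Rs: {rmse_without:.2f}")
print(f"RMSE increase: {increase:.1f}%")
```

In the study, the analogous figure is an average over stations and input combinations; the sketch covers a single pair of prediction series.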