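Categorical variable
Collinearity
Haze
Statistics
Generalized additive model
Variable
Random forest
Mathematics
Environmental science
Econometrics
Geography
Meteorology
Computer science
Machine learning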
Authors
Vera Ling Hui Phung, Kazutaka Oka, Yasuaki Hijioka, Kayo Ueda, Mazrura Sahani, Wan Rozita Wan Mahiyuddin
Identifier
DOI:10.1016/j.scitotenv.2022.157312
Abstract
Environmental factors have been associated with adverse health effects in epidemiological studies. The main exposure variable is usually determined via prior knowledge or statistical methods. This can be challenging when evidence to support prior knowledge is scarce, or when statistical methods must contend with collinearity. This study aimed to investigate the relative importance of environmental variables for under-five mortality in Malaysia via a random forest approach.

We applied a conditional permutation importance via a random forest (CPI-RF) approach to evaluate the relative importance of weather- and air pollution-related environmental factors for daily under-five mortality in Malaysia. The study period spanned January 1, 2014 to December 31, 2016. In data preparation, deviation mortality counts were derived through a generalized additive model, adjusting for long-term trend and seasonality. Analyses were conducted considering mortality causes (all-cause, natural-cause, or external-cause) and data structures (continuous, categorical, or all types [i.e., including all variables of continuous type and all variables of categorical type]). The main analysis comprised two stages. In Stage 1, Boruta selection was applied for preliminary screening to remove highly unimportant variables. In Stage 2, the variables retained by Boruta were used in the CPI-RF analysis. The final importance value was obtained as the average from a 10-fold cross-validation.

Some heat-related variables (maximum temperature, heat wave), temperature variability, and haze-related variables (PM10, PM10-derived haze index, PM10- and fire-derived haze index, fire hotspot) were among the prominent variables associated with under-five mortality in Malaysia. The important variables were consistent across all- and natural-cause mortality and in the sensitivity analyses. However, the most important variables differed between natural- and external-cause under-five mortality.

Heat-related variables, temperature variability, and haze-related variables were consistently prominent for all- and natural-cause under-five mortality, but not for external-cause mortality.
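
The last step described in the abstract, averaging a permutation-based importance over a 10-fold cross-validation, can be sketched roughly as follows. This is a minimal illustration under stated assumptions and not the authors' CPI-RF pipeline: it uses plain (marginal) permutation importance rather than conditional permutation, a k-nearest-neighbour regressor as a stand-in for the random forest, and synthetic data. The function names knn_predict, held_out_mse, and cv_permutation_importance are hypothetical.

```python
import random
from statistics import mean


def knn_predict(train_X, train_y, row, k=5):
    """Stand-in model (not a random forest): mean target of the k nearest training rows."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], row)),
    )[:k]
    return mean(train_y[i] for i in nearest)


def held_out_mse(train_X, train_y, test_X, test_y):
    """Mean squared error of the stand-in model on held-out rows."""
    return mean(
        (knn_predict(train_X, train_y, row) - target) ** 2
        for row, target in zip(test_X, test_y)
    )


def cv_permutation_importance(X, y, n_folds=10, seed=0):
    """Increase in held-out MSE after shuffling each column, averaged over folds."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    rng.shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    n_features = len(X[0])
    per_fold = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        test_X = [X[i] for i in test_idx]
        test_y = [y[i] for i in test_idx]
        baseline = held_out_mse(train_X, train_y, test_X, test_y)
        scores = []
        for j in range(n_features):
            # Marginal permutation: shuffle column j across the held-out rows only.
            column = [row[j] for row in test_X]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(test_X, column)]
            scores.append(held_out_mse(train_X, train_y, shuffled, test_y) - baseline)
        per_fold.append(scores)
    return [mean(fold[j] for fold in per_fold) for j in range(n_features)]


# Synthetic data, not from the study: the target depends only on column 0.
data_rng = random.Random(42)
X = [[data_rng.uniform(0, 1), data_rng.uniform(0, 1)] for _ in range(200)]
y = [10.0 * row[0] + data_rng.gauss(0, 0.1) for row in X]
importance = cv_permutation_importance(X, y)
print(importance)
```

In this toy setup the first column should receive a much larger importance than the second, because only the first column drives the target. A conditional scheme, as named in the abstract, would instead permute a variable within groups defined by its correlated covariates, which is what makes it better suited to collinear environmental exposures.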