Missing data
Imputation (statistics)
Pooling
Statistics
Computer science
Data mining
Mathematics
Artificial intelligence
Authors
Katrina Blazek,Anita van Zwieten,Valeria Saglimbene,Armando Teixeira‐Pinto
Identifier
DOI:10.1016/j.kint.2020.07.035
Abstract
Health data are often plagued with missing values that can greatly reduce the sample size if only complete cases are considered for analysis. Furthermore, analyses that ignore missing data have the potential to introduce bias in the parameter estimates. Multiple imputation techniques have been developed to recover the information that would otherwise be lost when excluding observations with missing data and to help minimize bias. However, the validity of analyses using imputed data relies on the imputation model having been correctly specified. The aim of this guide is to aid the reader in the decision-making process when conducting an analysis with multiply imputed data in the context of nephrology research. We discuss (i) missing mechanism assumption, (ii) imputation method, (iii) imputation model, (iv) derived variables, (v) the number of imputed data sets, (vi) diagnostic checks, (vii) analysis and pooling of results, and (viii) reporting the results. This process is demonstrated using data from the National Health and Nutrition Examination Survey to explore the association between hypertension and kidney disease in adults from the general population. Example code is provided for SAS software and the mice package in R.
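Step (vii), analysis and pooling of results, is conventionally carried out with Rubin's rules: the analysis model is fitted separately to each imputed data set, the point estimates are averaged, and the variance combines the average within-imputation variance with the between-imputation variance. The abstract does not name the pooling method, so the following is only a minimal Python sketch under that assumption, not the SAS or mice code the guide provides. The function name pool_rubin and the input numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their squared standard errors.

    Hypothetical helper illustrating Rubin's rules; not from the paper.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # average within-imputation variance
    b = variance(estimates)        # between-imputation variance (m - 1 denominator)
    t = u_bar + (1 + 1 / m) * b    # total variance of the pooled estimate
    return q_bar, t


# Made-up log odds ratios and squared standard errors from m = 5 imputed data sets.
estimates = [0.50, 0.55, 0.45, 0.52, 0.48]
variances = [0.010, 0.011, 0.009, 0.010, 0.010]

pooled, total_var = pool_rubin(estimates, variances)
print(f"pooled estimate = {pooled:.3f}, pooled SE = {total_var ** 0.5:.3f}")
```

The sketch omits the degrees-of-freedom adjustment used for confidence intervals and tests; dedicated software handles that step.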