协变量
因果推理
倾向得分匹配
Lasso(编程语言)
混淆
计算机科学
结果(博弈论)
推论
数据集
统计
计量经济学
数学
人工智能
机器学习
万维网
数理经济学
作者
Qian Gao,Yu Zhang,Jie Liang,Hongwei Sun,Tong Wang
摘要
Propensity score (PS) methods are popular when estimating causal effects in non-randomized studies. Drawing causal conclusion relies on the unconfoundedness assumption. This assumption is untestable and is considered more plausible if a large number of pre-treatment covariates are included in the analysis. However, previous studies have shown that including unnecessary covariates into PS models can lead to bias and efficiency loss. With the ever-increasing amounts of available data, such as the omics data, there is often little prior knowledge of the exact set of important covariates. Therefore, variable selection for causal inference in high-dimensional settings has received considerable attention in recent years. However, recent studies have focused mainly on binary treatments. In this study, we considered continuous treatments and proposed the generalized outcome-adaptive LASSO (GOAL) to select covariates that can provide an unbiased and statistically efficient estimation. Simulation studies showed that when the outcome model was linear, the GOAL selected almost all true confounders and predictors of outcome and excluded other covariates. The accuracy and precision of the estimates were close to ideal. Furthermore, the GOAL is robust to model misspecification. We applied the GOAL to seven DNA methylation datasets from the Gene Expression Omnibus database, which covered four brain regions, to estimate the causal effects of epigenetic aging acceleration on the incidence of Alzheimer's disease.
科研通智能强力驱动
Strongly Powered by AbleSci AI