Data-Driven Variable Decomposition for Treatment Effect Estimation

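Keywords: Observational study; Confounding; Causal inference; Notation; Variance (accounting); Inference; Mathematics; Variable (mathematics); Statistics; Propensity score matching; Computer science; Algorithm; Econometrics; Artificial intelligence; Mathematical analysis; Business; Accounting; Arithmetic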

Authors

Kun Kuang,Peng Cui,Hao Zou,Bo Li,Jianrong Tao,Fei Wu,Shiqiang Yang

Source

Journal: IEEE Transactions on Knowledge and Data Engineering [IEEE Computer Society]
Date: 2020-07-03 Volume (Issue): 34 (5): 2120-2134 Citations: 12

Identifier

DOI: 10.1109/tkde.2020.3006898

Abstract

Causal inference plays an important role in decision making in many fields, such as social marketing, healthcare, and public policy. One fundamental problem in causal inference is treatment effect estimation in observational studies when variables are confounded. Confounding is generally controlled for with the propensity score. However, this approach treats all observed variables as confounders and ignores the adjustment variables, which have no influence on the treatment but are predictive of the outcome. Recently, it has been demonstrated that adjustment variables are effective in reducing the variance of the estimated treatment effect. However, how to automatically separate confounders from adjustment variables in observational studies remains an open problem, especially with high-dimensional variables, which are common in the big data era. In this paper, we first propose a Data-Driven Variable Decomposition (D$^2$VD) algorithm, which can 1) automatically separate confounders and adjustment variables with a data-driven approach, and 2) simultaneously estimate the treatment effect in observational studies with high-dimensional variables. Under standard assumptions, we theoretically prove that our D$^2$VD algorithm yields an unbiased estimate of the treatment effect and achieves lower variance than traditional propensity score based methods. Moreover, to address the challenges of high-dimensional variables and nonlinearity, we extend D$^2$VD to a nonlinear version, namely the Nonlinear-D$^2$VD (N-D$^2$VD) algorithm. To validate the effectiveness of the proposed algorithms, we conduct extensive experiments on both synthetic and real-world datasets. The experimental results demonstrate that D$^2$VD and N-D$^2$VD can separate the variables precisely and estimate the treatment effect more accurately and with tighter confidence intervals than state-of-the-art methods. We also demonstrate that the features ranked highest by our algorithm have the best prediction performance on an online advertising dataset.
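The abstract's claim that adjustment variables reduce the variance of a propensity-score estimate can be illustrated with a small simulation. The sketch below is not the D$^2$VD algorithm: it assumes the split into one confounder `x` and one adjustment variable `z` is already known, uses the true propensity score rather than an estimated one, and all names, coefficients, and the data-generating model are hypothetical choices made for illustration. It compares a plain inverse-propensity-weighted (IPW) estimate with the same estimate applied to the outcome after a linear fit on `z` has been subtracted.

```python
import numpy as np


def simulate(n, rng):
    """Draw one synthetic observational dataset (hypothetical model, true ATE = 1)."""
    x = rng.normal(size=n)                    # confounder: affects treatment and outcome
    z = rng.normal(size=n)                    # adjustment variable: affects outcome only
    e = 1.0 / (1.0 + np.exp(-0.5 * x))        # true propensity score P(T=1 | x)
    t = rng.binomial(1, e)                    # treatment assignment
    y = 1.0 * t + 2.0 * x + 3.0 * z + rng.normal(size=n)
    return x, z, e, t, y


def ipw_ate(y, t, e):
    """IPW estimate of the average treatment effect with known propensity e."""
    return np.mean(y * (t - e) / (e * (1.0 - e)))


def adjusted_ipw_ate(y, z, t, e):
    """IPW estimate after removing the part of y explained linearly by z."""
    design = np.column_stack([np.ones_like(z), z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)   # OLS of y on [1, z]
    return ipw_ate(y - design @ coef, t, e)


def compare(reps=200, n=2000, seed=0):
    """Return arrays of plain and adjusted estimates over repeated simulations."""
    rng = np.random.default_rng(seed)
    plain, adjusted = [], []
    for _ in range(reps):
        x, z, e, t, y = simulate(n, rng)
        plain.append(ipw_ate(y, t, e))
        adjusted.append(adjusted_ipw_ate(y, z, t, e))
    return np.array(plain), np.array(adjusted)


if __name__ == "__main__":
    plain, adjusted = compare()
    print("plain IPW:    mean", plain.mean(), "std", plain.std())
    print("adjusted IPW: mean", adjusted.mean(), "std", adjusted.std())
```

Because `z` does not influence the treatment once `x` is given, subtracting a function of `z` leaves the weighted mean centered on the same effect while removing outcome noise. Under this toy model both estimators should average close to the true effect of 1, with the adjusted one showing the smaller spread across replications.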
