DeLIVR: a deep learning approach to IV regression for testing nonlinear causal effects in transcriptome-wide association studies

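Causal inference; Computer science; Parametric statistics; Statistical hypothesis testing; Regression; Statistical inference; Inference; Algorithm; Artificial intelligence; Econometrics; Mathematics; Machine learning; Statistics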

Authors

Ruoyu He, Mingyang Liu, Zhaotong Lin, Zhong Zhuang, Xiaotong Shen, Wei Pan

Source

Journal: Biostatistics [Oxford University Press]
Date: 2023-01-04 Volume (issue): 25 (2): 468-485 Cited by: 5

Links

nih.gov, doi.org

Identifiers

DOI: 10.1093/biostatistics/kxac051

Abstract

Transcriptome-wide association studies (TWAS) have been increasingly applied to identify (putative) causal genes for complex traits and diseases. TWAS can be regarded as a two-sample two-stage least squares method for instrumental variable (IV) regression for causal inference. The standard TWAS (called TWAS-L) only considers a linear relationship between a gene’s expression and a trait in stage 2, which may lose statistical power when the true relationship is not linear. Recently, an extension of TWAS (called TWAS-LQ) considers both the linear and quadratic effects of a gene on a trait, which, however, is not flexible enough due to its parametric nature and may be low powered for nonquadratic nonlinear effects. On the other hand, a deep learning (DL) approach, called DeepIV, has been proposed to nonparametrically model a nonlinear effect in IV regression. However, it is both slow and unstable due to the ill-posed inverse problem of solving an integral equation with Monte Carlo approximations. Furthermore, in the original DeepIV approach, statistical inference, that is, hypothesis testing, was not studied. Here, we propose a novel DL approach, called DeLIVR, to overcome the major drawbacks of DeepIV, by estimating a related but different target function and including a hypothesis testing framework. We show through simulations that DeLIVR was both faster and more stable than DeepIV. We applied both parametric and DL approaches to the GTEx and UK Biobank data, showcasing that DeLIVR detected an additional 8 and 7 genes nonlinearly associated with high-density lipoprotein (HDL) cholesterol and low-density lipoprotein (LDL) cholesterol, respectively, all of which would be missed by TWAS-L, TWAS-LQ, and DeepIV; these genes include BUD13 associated with HDL, SLC44A2 and GMIP with LDL, all supported by previous studies.
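
To make the two-sample two-stage least squares view of TWAS concrete, the following is a minimal sketch on simulated data. It is not the authors' code: the function names (`fit_ols`, `two_sample_2sls`, `simulate`), the sample sizes, the SNP effects, and the effect size 0.5 are all illustrative assumptions. Stage 1 learns SNP-to-expression weights in a reference sample; stage 2 regresses the trait on the imputed expression in a separate sample. The `quadratic=True` option adds a squared term in the spirit of TWAS-LQ, whose exact specification may differ.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ [1, X]; the intercept comes first."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def two_sample_2sls(Z1, x1, Z2, y2, quadratic=False):
    """Hypothetical helper: stage 1 on sample 1, stage 2 on sample 2."""
    # Stage 1: regress expression on genotypes in the reference sample.
    w = fit_ols(Z1, x1)
    # Impute expression in the second sample, where it is not observed.
    x_hat = w[0] + Z2 @ w[1:]
    # Stage 2: regress the trait on imputed expression (optionally plus its square).
    if quadratic:
        feats = np.column_stack([x_hat, x_hat ** 2])
    else:
        feats = x_hat[:, None]
    return fit_ols(feats, y2)

def simulate(n, alpha, beta, rng):
    """Assumed data model: SNPs Z, confounder u, expression x, trait y."""
    Z = rng.binomial(2, 0.3, size=(n, len(alpha)))  # genotypes coded 0/1/2
    u = rng.normal(size=n)                          # unobserved confounder
    x = Z @ alpha + u + rng.normal(size=n)          # gene expression
    y = beta * x + u + rng.normal(size=n)           # trait with a linear causal effect
    return Z, x, y

rng = np.random.default_rng(0)
alpha = np.full(5, 0.5)   # assumed SNP effects on expression
true_beta = 0.5           # assumed causal effect of expression on the trait

Z1, x1, _ = simulate(5000, alpha, true_beta, rng)    # expression reference sample
Z2, _, y2 = simulate(20000, alpha, true_beta, rng)   # trait (GWAS-like) sample

coef_l = two_sample_2sls(Z1, x1, Z2, y2)                   # [intercept, slope]
coef_lq = two_sample_2sls(Z1, x1, Z2, y2, quadratic=True)  # [intercept, linear, quadratic]
beta_lin = coef_l[1]
print("linear stage-2 slope:", beta_lin)
print("linear and quadratic stage-2 terms:", coef_lq[1:])
```

Because the simulated trait depends linearly on expression, the stage-2 slope should land near the assumed effect and the quadratic coefficient near zero. Under a nonlinear truth that is not quadratic, both parametric fits can miss the signal, which is the gap the abstract says DeLIVR is meant to address.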
