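False positive paradox
Random effects model
Covariate
Genome-wide association study
Linear model
Confounding
Biology
Generalized linear mixed model
Statistics
Fixed effects model
Multiple comparisons problem
Kinship
Genetic association
Population
Generalized linear model
Computational biology
Mathematics
Genetics
Meta-analysis
Medicine
Internal medicine
Single-nucleotide polymorphism
Gene
Environmental health
Genotype
Political science
Law
Panel data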
Authors
Xiaolei Liu, Meng Huang, Bin Fan, Edward S. Buckler, Zhiwu Zhang
Source
Journal: PLOS Genetics
Date: 2016-02-01
Volume (issue): 12 (2): e1005767-e1005767
Citations: 870
Identifier
DOI: 10.1371/journal.pgen.1005767
Abstract
False positives in a Genome-Wide Association Study (GWAS) can be effectively controlled by a Mixed Linear Model (MLM) with fixed and random effects that incorporates population structure and kinship among individuals to adjust association tests on markers; however, the adjustment also compromises true positives. A modified MLM method, the Multiple Loci Linear Mixed Model (MLMM), incorporates multiple markers simultaneously as covariates in a stepwise MLM to partially remove the confounding between testing markers and kinship. To completely eliminate the confounding, we divided MLMM into two parts, a Fixed Effect Model (FEM) and a Random Effect Model (REM), and used them iteratively. FEM tests markers one at a time, with multiple associated markers included as covariates to control false positives. To avoid over-fitting in FEM, the associated markers are estimated in REM by using them to define kinship. The P values of testing markers and associated markers are unified at each iteration. We named the new method Fixed and random model Circulating Probability Unification (FarmCPU). Analyses of both real and simulated data demonstrated that FarmCPU improves statistical power compared to current methods. Additional benefits include an efficient computing time that is linear in both the number of individuals and the number of markers. A dataset with half a million individuals and half a million markers can now be analyzed within three days.
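The fixed-effect step described in the abstract (each testing marker fitted one at a time, with previously associated markers as covariates) can be illustrated with a small pure-Python sketch. This is not the FarmCPU software or its API: the function names (`solve`, `last_coef_p_value`, `fem_scan`), the simulated data, and the effect size are all hypothetical. The P value uses a normal approximation to the t distribution, and the REM step that selects associated markers through kinship is not implemented. Here the top marker from the first scan is simply reused as a covariate.

```python
import math
import random

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def last_coef_p_value(y, columns):
    """Ordinary least squares of y on the given columns.

    Returns an approximate two-sided P value for the LAST column's
    coefficient (assumption: normal approximation to the t distribution).
    """
    n, k = len(y), len(columns)
    xtx = [[sum(ci[r] * cj[r] for r in range(n)) for cj in columns]
           for ci in columns]
    xty = [sum(c[r] * y[r] for r in range(n)) for c in columns]
    beta = solve(xtx, xty)
    resid = [y[r] - sum(beta[j] * columns[j][r] for j in range(k))
             for r in range(n)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    # Last diagonal entry of (X'X)^-1, obtained by solving against a unit vector.
    unit = [0.0] * (k - 1) + [1.0]
    var_last = sigma2 * solve(xtx, unit)[-1]
    t = beta[-1] / math.sqrt(var_last)
    return math.erfc(abs(t) / math.sqrt(2))

def fem_scan(y, genotypes, pseudo_qtns=()):
    """Test each marker one at a time, with an intercept and the
    markers listed in pseudo_qtns as covariates (hypothetical helper)."""
    n = len(y)
    base = [[1.0] * n] + [genotypes[q] for q in pseudo_qtns]
    return {j: last_coef_p_value(y, base + [g])
            for j, g in enumerate(genotypes) if j not in pseudo_qtns}

# Simulated toy data (assumption): 200 individuals, 10 markers coded 0/1/2,
# with marker 3 as the only causal marker.
rng = random.Random(0)
n_ind, n_mark = 200, 10
genotypes = [[float(rng.randint(0, 2)) for _ in range(n_ind)]
             for _ in range(n_mark)]
phenotype = [2.0 * genotypes[3][i] + rng.gauss(0, 1) for i in range(n_ind)]

first_pass = fem_scan(phenotype, genotypes)
top = min(first_pass, key=first_pass.get)
# Second scan: the top marker becomes a covariate for all remaining tests.
second_pass = fem_scan(phenotype, genotypes, pseudo_qtns=(top,))
```

Each scan fits one small regression per marker, so the number of fits grows linearly with the number of markers. The sketch makes no attempt to reproduce the running times reported in the abstract.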