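Biobank
Mendelian randomization
Heritability
Observational error
Consistency
Type I and type II errors
Statistics
Psychology
Genetics
Biology
Genetic variation
Mathematics
Gene
Genotype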
Authors
Tabea Schoeler, Jean‐Baptiste Pingault, Zoltán Kutalik
Identifier
DOI:10.1038/s41562-024-02061-w
Abstract
Although the use of short self-report measures is common practice in biobank initiatives, such a phenotyping strategy is inherently prone to reporting errors. To explore challenges related to self-report errors, we first derived a reporting error score in the UK Biobank (UKBB; n = 73,127), capturing inconsistent self-reporting in time-invariant phenotypes across multiple measurement occasions. We then performed genome-wide scans on the reporting error score, applied downstream analyses (linkage disequilibrium score regression and Mendelian randomization) and compared its properties to the UKBB participation propensity. Finally, we improved phenotype resolution for 24 measures and inspected the changes in genomic findings. We found that reporting error was present across all 33 assessed self-report measures, with repeatability levels as low as 47% (childhood body size). Reporting error was not independent of UKBB participation, evidenced by the negative genetic correlation between the two outcomes (r_g = −0.77), their shared causes (for example, education) and the loss in self-report accuracy following participation bias correction. Across all analyses, the impact of reporting error ranged from reduced power (for example, for gene discovery) to biased estimates (for example, if present in the exposure variable) and attenuation of genome-wide quantities (for example, 21% relative attenuation in SNP heritability for childhood height). Our findings highlight that both self-report accuracy and selective participation are competing biases and sources of poor reproducibility for biobank-scale research.
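The two summary quantities quoted in the abstract, repeatability and relative attenuation, can be illustrated with a minimal Python sketch. The definitions used here are assumptions made for illustration (repeatability as the share of participants giving the same answer on two occasions; relative attenuation as the proportional drop of an observed estimate against a reference estimate) and may differ from the definitions used in the paper. The function names and the toy data are hypothetical and are not UKBB values.

```python
def repeatability(first, second):
    """Share of participants whose two reports agree (assumed definition)."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty response lists of equal length")
    agree = sum(a == b for a, b in zip(first, second))
    return agree / len(first)


def relative_attenuation(observed, reference):
    """Proportional drop of an observed estimate relative to a reference
    estimate (assumed definition)."""
    if reference == 0:
        raise ValueError("reference estimate must be non-zero")
    return (reference - observed) / reference


# Toy data only: a time-invariant item reported on two occasions.
visit_1 = ["thinner", "average", "plumper", "average", "thinner"]
visit_2 = ["thinner", "plumper", "plumper", "average", "average"]

print(f"repeatability: {repeatability(visit_1, visit_2):.2f}")
print(f"relative attenuation: {relative_attenuation(0.30, 0.40):.2f}")
```

Under these assumed definitions, a time-invariant phenotype reported without error would have a repeatability of 1, and an estimate unaffected by reporting error would show a relative attenuation of 0.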