Keywords
Generalizability theory
Statistical power
Sample size determination
Computer science
Robustness (evolution)
Neuroimaging
External validity
Cross-validation
Offset (computer science)
Sample (material)
Data mining
Power (physics)
Reproducibility
Artificial intelligence
Machine learning
Statistics
Psychology
Mathematics
Biochemistry
Chemistry
Physics
Chromatography
Quantum mechanics
Psychiatry
Gene
Programming language
Authors
Matthew Rosenblatt, Link Tejavibulya, Chris C. Camp, Rongtao Jiang, Margaret L. Westwater, Stephanie Noble, Dustin Scheinost
Identifier
DOI:10.1101/2023.10.25.563971
Abstract
Identifying reproducible and generalizable brain-phenotype associations is a central goal of neuroimaging. Consistent with this goal, prediction frameworks evaluate brain-phenotype models in unseen data. Most prediction studies train and evaluate a model in the same dataset. However, external validation, or the evaluation of a model in an external dataset, provides a better assessment of robustness and generalizability. Despite the promise of external validation and calls for its usage, the statistical power of such studies has yet to be investigated. In this work, we ran over 60 million simulations across several datasets, phenotypes, and sample sizes to better understand how the sizes of the training and external datasets affect statistical power. We found that prior external validation studies used sample sizes prone to low power, which may lead to false negatives and effect size inflation. Furthermore, increases in the external sample size led to increased simulated power directly following theoretical power curves, whereas changes in the training dataset size offset the simulated power curves. Finally, we compared the performance of a model within a dataset to the external performance. The within-dataset performance was typically within
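The abstract describes estimating statistical power as a function of the external (validation) sample size. As an illustration of the general idea, the sketch below simulates power for detecting a prediction-phenotype correlation of a given true effect size in external samples of varying size; it is a minimal, hypothetical example (the function name, effect sizes, and bivariate-normal data-generating assumption are illustrative), not the authors' simulation pipeline.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def simulate_external_power(true_r=0.2, n_external=200, n_sims=1000, alpha=0.05):
    """Estimate power to detect a prediction-phenotype association of size
    `true_r` (Pearson's r between model predictions and observed phenotype)
    in an external validation sample of size `n_external`.

    Illustrative assumption: predictions and phenotypes are drawn from a
    bivariate normal distribution with correlation `true_r`.
    """
    cov = [[1.0, true_r], [true_r, 1.0]]
    n_significant = 0
    for _ in range(n_sims):
        sample = rng.multivariate_normal([0.0, 0.0], cov, size=n_external)
        r, p = stats.pearsonr(sample[:, 0], sample[:, 1])
        # One-sided test: only positive prediction-phenotype correlations count.
        if r > 0 and p / 2 < alpha:
            n_significant += 1
    return n_significant / n_sims

# Power as a function of the external sample size (illustrative values)
for n in (25, 50, 100, 200, 400):
    print(n, simulate_external_power(n_external=n))
```

Sweeping `n_external` in this way traces an empirical power curve that can be compared against theoretical power curves, analogous to the comparison described in the abstract.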