How Generalizable Are Foundation Models When Applied to Different Demographic Groups and Settings?
Authors
Zhuxin Xiong, Xiaofei Wang, Yukun Zhou, Pearse A. Keane, Yih-Chung Tham, Ya Xing Wang, Tien Yin Wong
Identifier
DOI: 10.1056/aics2400497
Abstract
RETFound is a retinal image–based foundation artificial intelligence (AI) model that can be fine-tuned for downstream tasks. However, its generalizability to Asian populations remains unclear. In this study, we fine-tuned RETFound on an Asian-specific dataset and evaluated its performance against a conventional Vision Transformer model (pretrained on ImageNet) in diagnosing glaucoma and coronary heart disease and in predicting the 3-year risk of stroke in an Asian population. When fine-tuned on the "full" dataset, RETFound showed no significant improvement over the conventional Vision Transformer (areas under the curve [AUCs] of 0.863, 0.628, and 0.557 vs. 0.853, 0.621, and 0.543, respectively; all P≥0.2). In scenarios with limited training data (fine-tuned on ≤25% of the full dataset), RETFound showed a slight advantage (a maximum AUC increase of 0.03), but these improvements were not statistically significant (all P≥0.2). These findings highlight the challenges foundation AI models face in adapting to diverse demographic groups, underscoring the need for more diverse data in current foundation models and the importance of global collaboration on foundation model research.
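The study's own code is not reproduced here. As a minimal sketch of the workflow the abstract describes (initialize a Vision Transformer backbone, fine-tune it on a labeled retinal dataset, then score AUC on a held-out set), the following assumes PyTorch/torchvision and scikit-learn; the `build_model` helper, the toy tensors, and the optional RETFound checkpoint argument are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the authors' code): fine-tune a ViT backbone on a
# binary retinal-image task and report AUC. In the paper's setup, this loop
# would run twice -- once from ImageNet weights, once from RETFound weights --
# and the resulting AUCs would be compared statistically.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import vit_b_16, ViT_B_16_Weights
from sklearn.metrics import roc_auc_score


def build_model(foundation_state_dict=None):
    """ImageNet-pretrained ViT; optionally overwrite with a foundation-model
    checkpoint (hypothetical RETFound state_dict) before fine-tuning."""
    model = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1)  # downloads weights
    if foundation_state_dict is not None:
        model.load_state_dict(foundation_state_dict, strict=False)
    # Replace the classification head with a single logit for a binary task.
    model.heads.head = nn.Linear(model.heads.head.in_features, 1)
    return model


def finetune_and_auc(model, train_loader, val_loader, epochs=1, lr=1e-4):
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    model.train()
    for _ in range(epochs):
        for x, y in train_loader:
            opt.zero_grad()
            loss = loss_fn(model(x).squeeze(1), y.float())
            loss.backward()
            opt.step()
    # Collect sigmoid scores on the validation set and compute AUC.
    model.eval()
    scores, labels = [], []
    with torch.no_grad():
        for x, y in val_loader:
            scores += torch.sigmoid(model(x).squeeze(1)).tolist()
            labels += y.tolist()
    return roc_auc_score(labels, scores)


# Toy random tensors stand in for 224x224 RGB fundus photographs.
xs = torch.randn(16, 3, 224, 224)
ys = torch.tensor([0, 1] * 8)  # ensure both classes are present for AUC
loader = DataLoader(TensorDataset(xs, ys), batch_size=4)

model = build_model()  # ImageNet baseline; pass a RETFound checkpoint to compare
print(f"AUC: {finetune_and_auc(model, loader, loader):.3f}")
```

The abstract's data-efficiency experiments would correspond to repeating this fine-tuning with `train_loader` drawn from progressively smaller fractions (≤25%) of the full training set; the test used for the reported P values is not stated in the abstract.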