Keywords
Population; Incidence (epidemiology); Medicine; Misinformation; National Health and Nutrition Examination Survey; Epidemiology; Demography; Pathology; Computer science; Environmental health; Mathematics; Computer security; Geometry; Sociology
Authors
Ryan M. Blake, Johnathan A. Khusid
Source
Journal: Journal of Endourology [Mary Ann Liebert]
Date: 2024-05-15
Identifiers
DOI: 10.1089/end.2023.0703
Abstract
Introduction: Artificial intelligence tools such as the large language models (LLMs) Bard and ChatGPT have generated significant research interest. Using these LLMs to study the epidemiology of a target population could benefit urologists. We investigated whether Bard and ChatGPT can perform a large-scale calculation of the incidence and prevalence of kidney stone disease.

Materials and Methods: We obtained reference values from two published studies that used the National Health and Nutrition Examination Survey (NHANES) database to calculate the prevalence and incidence of kidney stone disease. We then tested the capability of Bard and ChatGPT to perform similar calculations using two different methods. First, we instructed the LLMs to access the datasets and perform the calculation independently. Second, we instructed the interfaces to generate customized computer code that could perform the calculation on downloaded datasets.

Results: While ChatGPT denied the ability to access and perform calculations on the NHANES database, Bard intermittently claimed the ability to do so. Bard's results were sometimes accurate but otherwise inaccurate and inconsistent. For example, Bard's "calculations" for the incidence of kidney stones from 2015-2018 were 2.1% (95% CI: 1.5-2.7), 1.75% (95% CI: 1.6-1.9), and 0.8% (95% CI: 0.7-0.9), while the published figure was 2.1% (95% CI: 1.5-2.7). Bard provided discrete mathematical details of its calculations; however, when prompted further, it admitted to having obtained the numbers from online sources, including our chosen reference papers, rather than from a de novo calculation. Both LLMs were able to produce Python code to use on the downloaded NHANES datasets; however, this code would not readily execute.

Conclusions: ChatGPT and Bard are currently incapable of performing epidemiological calculations and lack transparency and accountability. Caution should be used, particularly with Bard, as its claims of capability were convincingly misleading and its results were inconsistent.
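Editor's note: the following is a minimal sketch of the kind of script the study asked the LLMs to generate, namely a survey-weighted prevalence estimate of self-reported kidney stone history from downloaded NHANES files. It is not the code produced by Bard or ChatGPT in the study. The file and variable names (DEMO_J.XPT, KIQ_U_J.XPT, SEQN, WTINT2YR, KIQ026) are taken from the public 2017-2018 NHANES codebook and should be verified against the actual downloads.

# Sketch: survey-weighted prevalence of kidney stone history from NHANES 2017-2018.
# Assumes DEMO_J.XPT and KIQ_U_J.XPT have been downloaded to the working directory.
import pandas as pd

# Demographics file (respondent ID and interview weights) and the
# kidney-conditions questionnaire file, both in SAS transport (XPT) format.
demo = pd.read_sas("DEMO_J.XPT", format="xport")   # SEQN, WTINT2YR, ...
kiq = pd.read_sas("KIQ_U_J.XPT", format="xport")   # SEQN, KIQ026, ...

# Merge on the respondent sequence number.
df = demo.merge(kiq[["SEQN", "KIQ026"]], on="SEQN", how="inner")

# KIQ026: "Ever had kidney stones?" (1 = Yes, 2 = No; 7/9 = refused/don't know).
df = df[df["KIQ026"].isin([1, 2])]

# Survey-weighted prevalence: weighted share of "Yes" responses.
weights = df["WTINT2YR"]
prevalence = (weights * (df["KIQ026"] == 1)).sum() / weights.sum()
print(f"Weighted prevalence of kidney stone history: {prevalence:.1%}")

A full reproduction of the reference estimates would additionally need to combine survey cycles, rescale the weights accordingly, and use the stratum and PSU design variables (for example, via a survey-design library) to obtain valid confidence intervals.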