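Coronavirus disease 2019 (COVID-19)
Matching (statistics)
Medicine
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)
Audiology
Natural language processing
Disease
Internal medicine
Computer science
Pathology
Infectious disease (medical specialty)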
Authors
Wan In Wei, Cyrus Lap Kwan Leung, Arthur Tang, Edward Braddon McNeil, Samuel Yeung Shan Wong, Kin On Kwok
Identifier
DOI: 10.1016/j.cmi.2023.11.002
Abstract
Objectives
To investigate the feasibility and performance of Chat Generative Pretrained Transformer (ChatGPT) in converting symptom narratives into structured symptom labels.
Methods
We extracted symptoms from 300 deidentified symptom narratives of COVID-19 patients using a computer-based matching algorithm (the standard) and using prompt engineering in ChatGPT. Common symptoms were those with a prevalence >10% according to the standard, and less common symptoms were those with a prevalence of 2–10%. The performance of ChatGPT was compared with the standard using sensitivity and specificity with 95% exact binomial CIs (95% binCIs). In ChatGPT, we prompted without examples (zero-shot prompting) and with examples (few-shot prompting).
Results
In zero-shot prompting, GPT-4 achieved high specificity (0.947 [95% binCI: 0.894–0.978] to 1.000 [95% binCI: 0.965–0.988, 1.000]) for all symptoms, high sensitivity for common symptoms (0.853 [95% binCI: 0.689–0.950] to 1.000 [95% binCI: 0.951–1.000]), and moderate sensitivity for less common symptoms (0.200 [95% binCI: 0.043–0.481] to 1.000 [95% binCI: 0.590–0.815, 1.000]). Few-shot prompting increased the sensitivity and specificity. GPT-4 outperformed GPT-3.5 in response accuracy and consistent labelling.
Discussion
This work substantiates ChatGPT's role as a research tool in medical fields. Its performance in converting symptom narratives to structured symptom labels was encouraging, saving time and effort in compiling the task-specific training data. It potentially accelerates free-text data compilation and synthesis in future disease outbreaks and improves the accuracy of symptom checkers. Focused prompt training addressing ambiguous descriptions impacts medical research positively.
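The Methods compare ChatGPT's labels with the standard using sensitivity and specificity with 95% exact binomial CIs. The following is a minimal Python sketch of that kind of calculation for a single symptom, assuming the exact interval is the Clopper-Pearson interval and that labels are available as paired true/false values per narrative. The function names and the example labels are hypothetical and are not taken from the study.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(f, target, iters=200):
    """Bisection on [0, 1] for f(p) = target, where f is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(x, n, alpha=0.05):
    """Clopper-Pearson (exact) two-sided CI for x successes in n trials."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")
    # Lower bound: P(X >= x | p) = alpha / 2 (defined as 0 when x == 0).
    lower = 0.0 if x == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper bound: P(X <= x | p) = alpha / 2 (defined as 1 when x == n).
    upper = 1.0 if x == n else _solve_increasing(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


def sensitivity_specificity(reference, predicted):
    """Sensitivity and specificity of predicted labels against reference labels.

    Both arguments are equal-length sequences of booleans, one per narrative,
    for one symptom. Returns {"sensitivity": (estimate, (lo, hi)), ...}.
    """
    if len(reference) != len(predicted):
        raise ValueError("label sequences must have the same length")
    tp = sum(r and p for r, p in zip(reference, predicted))
    fn = sum(r and not p for r, p in zip(reference, predicted))
    tn = sum(not r and not p for r, p in zip(reference, predicted))
    fp = sum(not r and p for r, p in zip(reference, predicted))
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference")
    return {
        "sensitivity": (tp / (tp + fn), exact_binomial_ci(tp, tp + fn)),
        "specificity": (tn / (tn + fp), exact_binomial_ci(tn, tn + fp)),
    }


if __name__ == "__main__":
    # Hypothetical labels for one symptom across six narratives.
    standard = [True, True, True, False, False, False]
    chatgpt = [True, True, False, False, False, False]
    for name, (est, (lo, hi)) in sensitivity_specificity(standard, chatgpt).items():
        print(f"{name}: {est:.3f} [95% binCI: {lo:.3f}-{hi:.3f}]")
```

When every reference-positive narrative is detected, the lower bound of this interval reduces to (0.025)^(1/n), so the interval tightens only as the number of narratives with the symptom grows. This is one reason estimates for less common symptoms carry wide intervals.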