Harmonization
Amazon rainforest
Drug
Oncology
Medicine
Pharmacology
Biology
Philosophy
Ecology
Aesthetics
Authors
Jay G. Ronquillo,Brett R. South,Timothy L. Wiemken,Ajit Jadhav,Stephen Watt,Magdia De Jesus,Aida Habtezion
Abstract
Medical-domain natural language processing (NLP) and general-domain large language models (LLMs) are powerful computational tools available to the drug development community. These tools have the potential to harmonize real-world oncology data, which could accelerate the standardization, integration, and analysis of scalable drug development informatics pipelines, albeit with differing associated costs. This potential use case was evaluated by extracting from openFDA the drugs indicated by the National Cancer Institute for single solid tumor sites. Requests to map diagnosis codes to drug label indications were submitted to the Amazon Comprehend Medical NLP service and to the OpenAI GPT-3.5 Turbo LLM, the latter using a general harmonization prompt (twice) and a cancer-optimized prompt (once). When harmonizing the 78 oncology drugs, the LLM approach with the general prompt performed similarly to the NLP approach (74.4% vs. 67.9%, P=0.480) and gave the same accuracy on both runs (74.4% vs. 74.4%, P=1.0). The cancer-optimized LLM prompt showed greater harmonization accuracy than the NLP service (89.7% vs. 67.9%, P=0.002), while LLM costs were approximately 3.7 times lower, showing that a general-domain LLM was capable of being more accurate, adaptable, and cost-effective than conventional medical-domain NLP. (Funded by Pfizer, Inc.)
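The accuracy comparisons in the abstract can be roughly reproduced with a two-proportion test. The sketch below is only an illustration under stated assumptions: the abstract does not name the statistical test, so a 2x2 chi-square test with Yates continuity correction is assumed here; the correct-mapping counts (58, 53, and 70 of 78) are back-computed from the reported percentages and are not taken from the paper; and the function name `yates_chi2_p` is hypothetical. Because the same 78 drugs were used for every method, a paired test such as McNemar's might also be appropriate, but it would need per-drug results that the abstract does not give.

```python
import math


def yates_chi2_p(correct_a: int, correct_b: int, n: int) -> float:
    """Two-sided p-value for a 2x2 chi-square test with Yates correction.

    Compares two methods that were each scored on n items.
    ASSUMPTION: the source abstract does not state which test was used.
    """
    table = [
        [correct_a, n - correct_a],
        [correct_b, n - correct_b],
    ]
    total = 2 * n
    col_totals = [correct_a + correct_b, total - correct_a - correct_b]

    chi2 = 0.0
    for row in table:
        for j, observed in enumerate(row):
            expected = n * col_totals[j] / total
            if expected == 0:
                continue
            # Yates correction: shrink each deviation by 0.5, never below zero.
            diff = max(abs(observed - expected) - 0.5, 0.0)
            chi2 += diff ** 2 / expected

    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


N_DRUGS = 78
# Counts back-computed from the reported percentages (assumption).
LLM_GENERAL = round(0.744 * N_DRUGS)   # 58
NLP_SERVICE = round(0.679 * N_DRUGS)   # 53
LLM_CANCER = round(0.897 * N_DRUGS)    # 70

p_general_vs_nlp = yates_chi2_p(LLM_GENERAL, NLP_SERVICE, N_DRUGS)
p_general_vs_general = yates_chi2_p(LLM_GENERAL, LLM_GENERAL, N_DRUGS)
p_cancer_vs_nlp = yates_chi2_p(LLM_CANCER, NLP_SERVICE, N_DRUGS)

print(f"general LLM vs NLP:          P = {p_general_vs_nlp:.3f}")
print(f"general LLM run 1 vs run 2:  P = {p_general_vs_general:.3f}")
print(f"cancer-optimized LLM vs NLP: P = {p_cancer_vs_nlp:.3f}")
```

Under these assumptions the three p-values come out at about 0.480, 1.000, and 0.002, in line with the reported figures, which suggests (but does not confirm) that a continuity-corrected chi-square test was used.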