Authors
Zhiyong Lu, Yifan Peng, Trevor Cohen, Marzyeh Ghassemi, Chunhua Weng, Shubo Tian
Abstract
Large language models in biomedicine and health: current research landscape and future directions

Large language models (LLMs) are a specialized type of generative artificial intelligence (AI) focused on generating natural language text. These models are developed through extensive training on massive amounts of text data and use deep learning algorithms to generate new text that closely resembles human-generated text. Generative AI methods, including LLMs, are rapidly transforming various domains, including biomedicine and healthcare.[1][2][3][4][5][6] They have already demonstrated remarkable potential as a means to process and analyze large amounts of text, interpret natural language, and generate new content in these domains. For example, Nori et al reported that GPT-4 is able to correctly answer the majority of questions from medical practice licensing exams, comfortably obtaining a passing grade.[7] Similarly, Stribling et al found that this model exceeded the average performance of students in the graduate medical sciences on the majority of examinations, including strong performance on short answer and essay questions.[8] Even though passing the exam is not the same as applying the knowledge in a real-world setting, these results demonstrate that LLMs can generate appropriate multiple-choice and narrative responses to questions framed in natural language.

ChatGPT, first released in November 2022, has garnered phenomenal attention from both the scientific community and society at large. By the end of June 2024, a keyword search of "large language models" OR "ChatGPT" in PubMed returned over 4500 articles that discuss the technology and its implications for various topics, including medical informatics. In addition, LLM-based technologies have already been deployed in several healthcare systems and are offered as integrated products for use in the clinic within vendor electronic health record systems (for thoughts on initial evaluations of an early product, see Garcia et al [9] and Tai-Seale et al [10]). This rapid adoption of LLMs like ChatGPT brings an unprecedented opportunity to use this novel AI technology to transform healthcare and medicine.[11][12][13][14][15][16][17]

With this great potential also comes the need for trustworthy and responsible development and use of the technology. As we continue to explore the capabilities of ChatGPT and other LLMs, it is critical to address related ethical, legal, and social issues to ensure that the technology is used in ways that are safe, fair, trustworthy, and beneficial for all. In the context of biomedicine and healthcare, it is particularly important to engage stakeholders, such as AI researchers, developers of data-driven clinical decision support, care providers, and system implementers from both academic medical centers and industry, to ensure responsible use of LLMs for good.

To accelerate research and development in this area, we issued a call for submissions in Summer 2023, specifically focusing on the intersection of biomedicine/health and LLMs, and invited contributions on all related aspects. We invited submissions that report on innovative
informatics methods development and evaluation, as well as studies that demonstrate the effectiveness or limitations of LLM methodologies in healthcare. We particularly encouraged submissions that address the challenges and opportunities of this intersection and offer new insights into how these fields can work together to advance healthcare.

This editorial provides an overview of the papers accepted in this Focus Issue. We highlight major themes and unique aspects of the research papers in medical LLMs, discuss ongoing challenges, and recommend future research directions. Box 1 lists the relevant large language model terms and abbreviations used in this editorial.

Overall statistics of the Focus Issue

This JAMIA Focus Issue on LLMs in biomedicine and health has drawn enthusiasm from many researchers across different research disciplines. In total, we received over 150 submissions from authors in 25 countries and regions across 6 continents. The rigorous JAMIA peer review process was applied to all submissions, 41 of which were ultimately accepted for publication in the Focus Issue (Table 1). The majority of the accepted papers were written by authors in North America, followed by those in Asia and Europe (Figure 1A).

The Focus Issue highlights the multi-disciplinary, collaborative nature of medical informatics research across the broad JAMIA community. The number of authors per paper varies from 1 to 23, with an average of 7.3. Many papers feature authors with diverse expertise from different departments and organizations. The authors' expertise spans a wide range of fields, including computer science, data science, informatics, statistics, medicine, nursing, clinical services, public health policy, and more. Several papers also demonstrate scientific collaborations across different sectors, including academia, government labs, research institutes, hospitals, and industry. Additionally, a few papers showcase international collaborations among authors.