ChatGPT: The transformative influence of generative AI on science and healthcare

Authors
Julian Varghese, Julius Chapiro
Source
Journal: Journal of Hepatology [Elsevier]
Volume/Issue: 80(6): 977-980; Citations: 42
Identifier
DOI: 10.1016/j.jhep.2023.07.028
Abstract

"To improve is to change; to be perfect is to change often." ― Winston S. Churchill. In a time and age where technology is evolving at a sometimes incomprehensibly rapid pace, the liver community must grow with its challenges and adjust our practices to transformative influences on our science and practice. The Editorial Board of the Journal of Hepatology has previously responded to novel developments in Artificial Intelligence (AI) by including experts in the field into the editorial board. Publications utilizing AI technology are no longer uncommon in our journal and have become among the most highly debated and possibly practice-changing papers across all disciplines united by our focus on liver disease. As AI is rapidly evolving, this expert paper will focus on large language models and their possible impact on our research practice and clinical outlook, outlining both challenges and opportunities in the field. Generative AI creates new content, such as text, images, or music, based on patterns in the data it has been trained on. If Generative AI utilizes Large Language Models, it involves training on extensive text datasets and employing AI models with a large number of model parameters. ChatGPT (General Pretrained Transformers) is a popular example of a Generative AI that utilizes Large Language Models. It was developed by OpenAI, version 3.5 released in November 2022, then updated to its superior version 4.0 – called GPT-4 - in March 2023 and version 5 expected to be released in the near future. Figure 1 illustrates the terminology within the overarching concept of Artificial Intelligence (AI). The term AI was coined by McCarthy in the 1950s [1]Shannon CE, McCarthy J, Ashby WR. Automata studies. Princeton University Press. 1956;Google Scholar and describes a system that can mimic human behavior. This can be realized via expert-driven rule-based systems or by data-driven training via Machine Learning. The latter is capable of accomplishing classification tasks based on predetermined categories and features, e.g. recognizing patterns in data that have been previously labeled by an expert on a training data set. Deep Learning architectures are a specific subtype of Machine Learning that utilize an Artificial Neural Network design characterized by an input layer for data intake, multiple hidden layers of data analyses, and finally an output layer that characterizes the original data according to either a predetermined classification task or outlines freshly identified patterns in data that have not yet been identified by the human observer. While classical supervised Machine or Deep Learning requires labeled training data and provides a reduced fixed output like a numerical prediction (e.g. "this patient with Hepatocellular carcinoma will benefit from immunotherapy with 90% certainty"), unsupervised Neural Networks do not require labelled data and are able to detect previously unknown patterns in data. An advancement is semi-supervised Generative AI, which is trained on unlabeled data and then fine-tuned for specific supervised tasks. It can create more complex output based on input prompts and – in its most advanced version – may generate entirely new data contexts. For instance, ChatGPT utilizes transformer Neural Networks, which are pre-trained on unlabeled large text corpora to acquire a comprehensive understanding of language patterns. At this stage it is called a Large Langue Model. The model is then further fine-tuned in order to answer user prompts. 
The principal approach of pre-training and fine-tuning can be adapted for tasks beyond text data. For instance, new realistic images can be generated based on user prompts, as showcased by DALL-E, another creation of OpenAI [2]. Based on the aforementioned recent advancement of Artificial Neural Networks, powered by increasing hardware resources and the immense amount of available human-written text that serves as a training basis, ChatGPT generates human-like text better than most other Natural Language Processing tools [3]. It is highly effective in generating well-formulated text, which will be disruptive to the way we create text, including academic writing in the biomedical domain. Users can type in raw text containing bullet points or subheadings, which ChatGPT can process into coherent text, allowing the user to carefully refine the output and thereby create well-written and meaningful text (a minimal example of this workflow is sketched after this paragraph). In particular, this will be helpful for non-native speakers by enabling a technology-based democratization of language skills. The advanced processing of language syntax has the ability to improve upon existing language input and semantically add to the initial prompt, generating explanations and references or, in the best-case scenario, fact-checking the reasoning offered by the primary end user. In addition to promising text completion and writing support, some early applications have been systematically tested for specific tasks. These include supporting computer programmers in advanced tasks such as converting or generating programming source code, proof-reading code and bug-fixing [4], classifying hate speech on Twitter [5], and passing the USMLE medical exam [6]. The performances range from moderate to high, which is impressive considering that ChatGPT was never fine-tuned for these specialized applications and is limited to data from before the year 2022; this demonstrates its applicability as an easy-to-use, general-purpose AI. It remains to be seen whether future variations of LLMs can be fine-tuned such that these tasks can be solved more accurately, and whether such models could perform on par with established machine learning models that are already trained on high-quality structured data.
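As a concrete illustration of the bullet-points-to-prose workflow referenced above, the following sketch uses the official OpenAI Python library (openai >= 1.0). The model identifier, the prompt wording and the example bullet points are illustrative assumptions, and an OPENAI_API_KEY environment variable is assumed to be set:

```python
# A hedged sketch of turning bullet points into coherent prose with the
# official OpenAI Python library. Model name and prompts are assumptions,
# not a prescription; the draft still needs manual refinement.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

bullet_points = """
- NAFLD prevalence rising worldwide
- linked to obesity and type 2 diabetes
- early detection via non-invasive fibrosis scores
"""

response = client.chat.completions.create(
    model="gpt-4",  # assumed model identifier; availability depends on the account
    messages=[
        {"role": "system",
         "content": "Rewrite the user's bullet points as one coherent academic paragraph."},
        {"role": "user", "content": bullet_points},
    ],
)
# The generated draft must be fact-checked and refined by the author.
print(response.choices[0].message.content)
```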
One key advantage that LLMs could have over classical supervised Deep Learning is their chatbot functionality, which can handle user questions more intuitively than a complex graphical user interface that requires structured input. This type of free-text communication has the potential to improve human-machine interaction, e.g. increasing the accessibility of technology for patients, in particular visually impaired or blind people, and improving access to information resources for health professionals and researchers. The accuracy of such systems, as with all Machine Learning systems, relies heavily on the training data, which needs to be representative of the intended task. It is assumed that ChatGPT was trained on website materials and books; thus it could amplify the biases of the content creators and ignore the knowledge or opinions of marginalized communities [7]. A lack of transparency has been frequently mentioned [8], owing to the scant information disclosed regarding the pre-training data used and the supervised fine-tuning tasks that led to the remarkable performance of ChatGPT. For a specific output, ChatGPT neither provides explainability of the generation process nor discloses its level of uncertainty. When faced with topics for which the model has not received adequate training or supervision, it is probable that the generated output will be fabricated, yet delivered with a strong sense of certainty, a phenomenon known as "Artificial Hallucination". According to OpenAI's official website [9], information processed by ChatGPT includes the history of conversations and account details such as name and contact information. Information is transferred from the user's location to OpenAI's facilities and servers in the United States. Conversations can be reviewed and used for further training. Users are explicitly advised against entering sensitive information. Medical information extracted from electronic medical records should be handled with great caution, and it is advisable to consult the local IT department before any use of new information technology or web services that operate outside the care facility. With respect to medical data safety, approaches such as federated learning are currently being discussed as possible solutions for exposing ChatGPT to clinical data. While this measure may be effective in preventing protected health information from being widely disseminated, there is currently no safeguard in place to effectively monitor the quality of the learning experience and to direct future ChatGPT output towards a verifiable and accurate level of medical information.
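As a complement to such organizational safeguards, a basic technical precaution is to strip obvious identifiers before any free text is sent to an external service. The following sketch is our own simplistic illustration, not a validated de-identification method and not an OpenAI feature; the regular-expression patterns are assumptions that will miss many identifier formats:

```python
# A deliberately simplistic sketch of redacting obvious identifiers before
# text leaves the care facility. Real de-identification of medical records
# requires validated tools and institutional approval; the patterns below
# are illustrative assumptions only.
import re

REDACTION_PATTERNS = {
    r"\b\d{2}\.\d{2}\.\d{4}\b": "[DATE]",       # dates such as 24.06.2023
    r"\b[\w.+-]+@[\w-]+\.[\w.]+\b": "[EMAIL]",  # e-mail addresses
    r"\b\d{3}-\d{2}-\d{4}\b": "[SSN]",          # US social security numbers
    r"\bMRN[:\s]*\d+\b": "[MRN]",               # medical record numbers
}

def redact(text: str) -> str:
    """Replace obvious identifiers with placeholder tokens."""
    for pattern, placeholder in REDACTION_PATTERNS.items():
        text = re.sub(pattern, placeholder, text)
    return text

note = "Patient MRN 483920, seen 24.06.2023, contact j.doe@example.org."
print(redact(note))  # -> Patient [MRN], seen [DATE], contact [EMAIL].
```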
While ChatGPT can accelerate the writing process, one has to be concerned about fabricated output and the amplification of bias. An illustrative example of Artificial Hallucination in academic writing appears when asking ChatGPT to provide literature sources for scientific statements or for one of its generated answers: the provided sources are often fabricated, with non-existent titles or PubMed IDs (PMIDs) [10]. Therefore, literature review and citation remain a critical part of scientific writing; they should be done manually and can still be supported by other existing, interpretable IT tools, and a model-supplied reference can be cross-checked programmatically, as sketched below.
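Because hallucinated citations can look entirely plausible, a simple automated lookup is one practical line of defense. The following sketch is our own illustration, not part of the paper: it queries NCBI's public E-utilities endpoint with the requests library; the PMID and the claimed title are placeholders.

```python
# A sketch for cross-checking a model-supplied PMID against PubMed via
# NCBI's public E-utilities API. A missing record, or a title that does
# not match the claimed citation, suggests a fabricated reference.
import requests

ESUMMARY_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"

def lookup_pmid(pmid: str):
    """Return the article title registered under a PMID, or None if absent."""
    params = {"db": "pubmed", "id": pmid, "retmode": "json"}
    data = requests.get(ESUMMARY_URL, params=params, timeout=10).json()
    record = data.get("result", {}).get(pmid, {})
    return record.get("title")  # None if the PMID does not resolve

claimed_title = "Some title quoted by the chatbot"  # placeholder
title = lookup_pmid("36812345")                     # placeholder PMID
if title is None or claimed_title.lower() not in title.lower():
    print("Citation could not be verified - check it manually.")
```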
If a significant amount of an author's text is generated via an LLM, this should be mentioned in a note or acknowledgement specifying the exact LLM version. It will depend on the editorial policies of each journal to define what counts as a significant amount. The future will show whether, or to what degree, LLM use is acceptable at all for scientific conferences or journals. For instance, while the journal Science [11] and the International Conference on Machine Learning [12] have recently banned submissions that use ChatGPT or other LLMs, many other journals are currently considering updates to their editorial policies. In our opinion, LLMs are here to stay, as the aforementioned benefits and their growing use cannot be disregarded; their issues must be handled with the caution required for any newly emerging technology. An additional, currently under-explored application of ChatGPT may be its utility for journal editors, reviewers and authors addressing requested revisions. As most journal editors and reviewers face rapidly growing numbers of submissions, future applications of LLMs may bring substantial relief for tasks requiring a high degree of effort with a low degree of expertise, such as proof-reading for proper syntax, adjusting journal-specific styles and assuring the semantic integrity of submitted articles. While it should be emphasized that LLMs may never replace the professional judgment of a seasoned peer-reviewer regarding the credibility and quality of submitted original research, other highly time-consuming tasks that bear a lower level of risk may very well be "outsourced" to LLMs if an appropriate level of supervision can be guaranteed.

LLMs also have enormous potential for improving communication in healthcare. If appropriately trained and validated, such models may excel at patient education due to their unparalleled ability to provide varying, adaptable degrees of medical information to patients in an interactive, iterative manner. This feature may significantly improve access to care and allow for improved resource utilization with respect to interactions between patients and healthcare professionals. While LLMs may never replace the doctor-patient relationship, properly trained, sub-specialized LLMs have the potential to become capable extenders of physicians, particularly for underserved populations. From providing plain-language summaries of the most recently published research results and disease-management guidelines to assisting patients with gathering information on upcoming procedures, diagnosed conditions or the management of prescriptions, the possibilities are broad. For physicians, LLMs may play the role of a digital interpreter and interlocutor for increasingly complex electronic medical record systems and may enhance workflows related to note writing, report dictation, and data extraction and input. LLMs may quickly become the bridge that interprets complex or lengthy sub-specialty reports, e.g. from pathology or radiology, for patients and general practitioners, easing linguistic barriers and providing language at the level requested by the end user; a possible interaction of this kind is sketched below.
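A hypothetical example of such a report "translation", again using the OpenAI Python library: the model identifier, prompt and report text are illustrative assumptions, and, as discussed below, any use on real patient data would first require de-identification and regulatory clearance.

```python
# A hedged sketch of the "digital interpreter" idea: asking an LLM to
# restate a sub-specialty report in plain language at a requested reading
# level. All names and prompts are illustrative assumptions; output must
# be physician-reviewed before reaching a patient.
from openai import OpenAI

client = OpenAI()

radiology_report = (
    "LI-RADS 5 observation in segment VIII with arterial phase "
    "hyperenhancement and washout on portal venous phase."
)

response = client.chat.completions.create(
    model="gpt-4",  # assumed model identifier
    messages=[
        {"role": "system",
         "content": "Explain the following radiology report to a patient "
                    "at an 8th-grade reading level. Do not add new medical claims."},
        {"role": "user", "content": radiology_report},
    ],
)
print(response.choices[0].message.content)
```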
Generative AI, and LLMs in particular, are no exception to the widely debated risks of applying AI in healthcare. Aside from the aforementioned issues with Artificial Hallucination and the risks of fabricated content, the black-box design of virtually all LLM algorithms hinders physicians from understanding the reasoning behind an algorithm's output, ultimately undermining our trust in ChatGPT as a potentially useful instrument in science and clinical practice [13]. This problem has been taken seriously by various national and international governing bodies, including the European Parliament. The General Data Protection Regulation and the recent draft AI Act from the EU [14] strongly endorse the right of doctors and patients to receive an explanation for results presented by AI applications, particularly in healthcare, where AI can be regarded as a high-risk application. As such, EU legislative documents suggest that black-box decision-making might violate medical ethics and undermine patients' rights to autonomy and informed consent. Therefore, all applications of generative AI such as ChatGPT to clinical patient care will have to be safeguarded by mandatory elements of the algorithms assuring full transparency of decision-making [15]. Other regulatory aspects of AI will also remain relevant. As mentioned, online LLMs should not be used for processing patient information, owing to the privacy of sensitive patient data. Even if the software is implemented within the internal hospital network, it will require regulatory approval when used for medical purposes. Any software technology that aims to support clinical decision-making, and therefore has an effect on the diagnosis, treatment or prevention of disease, qualifies as a medical device. Such software, also called Software as a Medical Device (SaMD), is strictly regulated in different regions, e.g. by the FDA in the US or by the Medical Device Regulation (MDR) and the IVDR in the EU.

In general, AI software that aims to improve clinical decision-making will require high-quality development, risk management, and proven benefit while preserving patient safety before acquiring approval. As none of the current LLMs are approved as SaMD, they should not be used for routine clinical decision-making. Moreover, health professionals can be held accountable when using such unapproved systems, as is the case when administering drugs without approval. ChatGPT has tremendous potential to improve how we generate and edit text and will increase the accessibility of information resources. The future will bring a variety of Large Language Models and generative AI applications that will be further tailored to user needs and will therefore outperform current versions on specific tasks. The technology is still in its infancy and requires critical human oversight owing to bias amplification, Artificial Hallucination, and the current lack of model transparency and explainability. Future developments can address these shortcomings by actively involving the scientific community, patient needs and ongoing AI regulation. Despite their limitations, we are convinced that Large Language Models will play an important role in research activities and should therefore be embraced within editorial policies and regulations.

References
[1] Shannon CE, McCarthy J, Ashby WR. Automata Studies. Princeton University Press; 1956.
[2] Marcus G, Davis E, Aaronson S. A very preliminary analysis of DALL-E 2. arXiv; 2022. Available from: http://arxiv.org/abs/2204.13807
[3] Deng J, Lin Y. The benefits and challenges of ChatGPT: an overview. Frontiers in Computing and Intelligent Systems. 2022;2:81-83.
[4] Surameery NMS, Shakor MY. Use Chat GPT to solve programming bugs. International Journal of Information Technology & Computer Engineering (IJITC). 2023;3:17-22.
[5] Huang F, Kwak H, An J. Is ChatGPT better than human annotators? Potential and limitations of ChatGPT in explaining implicit hate speech. In: Companion Proceedings of the ACM Web Conference 2023 (WWW '23 Companion). Association for Computing Machinery; 2023. p. 294-297. Available from: https://dl.acm.org/doi/10.1145/3543873.3587368
[6] Kung TH, Cheatham M, Medenilla A, Sillos C, Leon LD, Elepaño C, et al. Performance of ChatGPT on USMLE: potential for AI-assisted medical education using large language models. PLOS Digital Health. 2023;2:e0000198.
[7] Yang H. How I use ChatGPT responsibly in my teaching. Nature. 2023 Apr 12. Available from: https://www.nature.com/articles/d41586-023-01026-9
[8] ChatGPT: five priorities for research. Nature. Available from: https://www.nature.com/articles/d41586-023-00288-7
[9] What is ChatGPT? OpenAI Help Center. Available from: https://help.openai.com/en/articles/6783457-what-is-chatgpt
[10] Alkaissi H, McFarlane SI. Artificial hallucinations in ChatGPT: implications in scientific writing. Cureus. 2023;15(2). Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9939079/
[11] ChatGPT is fun, but not an author. Science. Available from: https://www.science.org/doi/10.1126/science.adg7879
[12] ICML 2023 LLM policy. Available from: https://icml.cc/Conferences/2023/llm-policy
[13] Cutillo CM, Sharma KR, Foschini L, Kundu S, Mackintosh M, Mandl KD, et al. Machine intelligence in healthcare—perspectives on trustworthiness, explainability, usability, and transparency. NPJ Digital Medicine. 2020;3:47.
[14] Proposal for a Regulation of the European Parliament and of the Council laying down harmonised rules on artificial intelligence (Artificial Intelligence Act). 2021. Available from: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=celex%3A52021PC0206
[15] Reddy S. Explainability and artificial intelligence in medicine. The Lancet Digital Health. 2022;4:e214-e215.