BioInstruct: instruction tuning of large language models for biomedical natural language processing

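Keywords: Computer science; Task (project management); Biomedical text mining; Domain (mathematical analysis); Natural language processing; Natural language; Artificial intelligence; Natural (archaeology); Language model; Text mining; Mathematics; History; Mathematical analysis; Economics; Archaeology; Management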

Authors

Hieu Tran, Zhichao Yang, Zonghai Yao, Hong Yu

Source

Journal: Journal of the American Medical Informatics Association [Oxford University Press]
Date: 2024-06-04  Volume (issue): 31 (9): 1821-1832  Citations: 10

Links

arxiv.org, nih.gov, doi.org

Identifiers

DOI: 10.1093/jamia/ocae122

Abstract

Objectives: To enhance the performance of large language models (LLMs) in biomedical natural language processing (BioNLP) by introducing a domain-specific instruction dataset and examining its impact when combined with multi-task learning principles.

Materials and Methods: We created BioInstruct, comprising 25 005 instructions, to instruction-tune LLMs (LLaMA 1 and 2, 7B and 13B versions). The instructions were created by prompting the GPT-4 language model with 3 seed samples randomly drawn from a set of 80 human-curated instructions. We employed Low-Rank Adaptation (LoRA) for parameter-efficient fine-tuning. We then evaluated these instruction-tuned LLMs on several BioNLP tasks, which can be grouped into 3 major categories: question answering (QA), information extraction (IE), and text generation (GEN). We also examined whether categories (eg, QA, IE, and generation) of instructions impact model performance.

Results and Discussion: Compared with LLMs without instruction tuning, our instruction-tuned LLMs demonstrated marked performance gains: 17.3% in QA on the average accuracy metric, 5.7% in IE on the average F1 metric, and 96% in generation tasks on the average GPT-4 score metric. Our 7B-parameter instruction-tuned LLaMA 1 model was competitive with or even surpassed other LLMs in the biomedical domain that were also fine-tuned from LLaMA 1 with vast domain-specific data or a variety of tasks. Our results also show that the performance gain is significantly higher when instruction fine-tuning is conducted with closely related tasks. Our findings align with the observations of multi-task learning, suggesting the synergies between 2 tasks.

Conclusion: The BioInstruct dataset serves as a valuable resource, and instruction-tuned LLMs lead to the best-performing BioNLP applications.
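
The instruction-generation step described in the abstract (prompting GPT-4 with 3 seed samples randomly drawn from 80 human-curated instructions) can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `build_generation_prompt`, the prompt wording, and the placeholder seed pool are all assumptions made for illustration, and no model call is made.

```python
import random


def build_generation_prompt(seed_pool, k=3, rng=None):
    """Draw k seed instructions at random and format them as a few-shot prompt.

    Hypothetical helper: the actual prompt used for BioInstruct is not
    given in the abstract, so the wording below is only a placeholder.
    """
    rng = rng or random.Random()
    seeds = rng.sample(seed_pool, k)  # sample without replacement
    lines = ["Here are example biomedical instructions:"]  # assumed wording
    for i, seed in enumerate(seeds, start=1):
        lines.append(f"{i}. {seed}")
    lines.append(f"{k + 1}.")  # left open for the model to continue
    return seeds, "\n".join(lines)


# Placeholder pool standing in for the 80 human-curated instructions.
seed_pool = [f"<seed instruction {i}>" for i in range(1, 81)]

seeds, prompt = build_generation_prompt(seed_pool, k=3, rng=random.Random(0))
print(prompt)
```

Repeating this draw with fresh random seeds would give each generation request a different set of 3 examples, which is one plausible way to diversify the generated instructions.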
