Generative grammar
Linguistics
Traditional medicine
Computer science
Medicine
Artificial intelligence
Philosophy
Authors
Yuxiang Dai,Xin Shao,Jinlu Zhang,Yulong Chen,Qian Chen,Jie Liao,Fangde Chi,Junhua Zhang,Xiaohui Fan
Identifier
DOI:10.1016/j.phrs.2024.107530
Abstract
The use of ground-breaking large language models (LLMs) paired with dialogue systems has become increasingly prevalent in the medical domain. Nevertheless, the expertise of LLMs in Traditional Chinese Medicine (TCM) remains limited, despite several TCM LLMs having been proposed recently. Herein, we introduce TCMChat (https://xomics.com.cn/tcmchat), a generative LLM built with pre-training (PT) and supervised fine-tuning (SFT) on large-scale curated TCM text knowledge and Chinese question-answering (QA) datasets. In detail, we first compiled a customized training set covering six scenarios of Chinese medicine through text mining and manual verification: TCM knowledge base, multiple-choice questions, reading comprehension, entity extraction, medical case diagnosis, and herb or formula recommendation. Next, we subjected the model to PT and SFT, using Baichuan2-7B-Chat as the foundation model. Benchmarking datasets and case studies further demonstrate the superior performance of TCMChat over existing models. Our code, data and model are publicly released on GitHub (https://github.com/ZJUFanLab/TCMChat) and HuggingFace (https://huggingface.co/ZJUFanLab), providing a high-quality knowledge base for research on TCM modernization together with a user-friendly dialogue web tool.
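As an illustration of the SFT stage described in the abstract, below is a minimal Python sketch using the Hugging Face transformers Trainer with Baichuan2-7B-Chat as the foundation model. The dataset file name (tcm_qa_train.jsonl), the question/answer field names, and the prompt template are illustrative assumptions rather than the authors' released pipeline; see the GitHub repository above for the official code.

# Minimal SFT sketch: causal-LM fine-tuning of Baichuan2-7B-Chat on
# curated TCM QA pairs. File name, field names, and prompt format are
# hypothetical placeholders, not TCMChat's actual training script.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

BASE_MODEL = "baichuan-inc/Baichuan2-7B-Chat"  # foundation model named in the abstract

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, trust_remote_code=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # ensure padding works for batching

# Hypothetical JSON-lines file with {"question": ..., "answer": ...} records
# drawn from the six curated TCM scenarios (knowledge base, multiple-choice
# questions, reading comprehension, entity extraction, case diagnosis,
# herb/formula recommendation).
dataset = load_dataset("json", data_files="tcm_qa_train.jsonl", split="train")

def tokenize(example):
    # Concatenate question and answer into one causal-LM training sequence.
    text = f"问:{example['question']}\n答:{example['answer']}"
    return tokenizer(text, truncation=True, max_length=1024)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="tcmchat-sft",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        num_train_epochs=3,
        learning_rate=2e-5,
        bf16=True,
        logging_steps=10,
    ),
    train_dataset=tokenized,
    # mlm=False gives the causal-LM objective; labels are the input ids,
    # shifted inside the model's forward pass.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)

trainer.train()

The PT stage would follow the same shape, substituting the QA pairs with raw curated TCM text; hyperparameters here are generic defaults, not the values reported in the paper.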