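Named-entity recognition
Task (project management)
Computer science
Artificial intelligence
Segmentation
Natural language processing
Domain (mathematical analysis)
Field (mathematics)
Conditional random field
Training set
Text segmentation
F1 score
Speech recognition
Machine learning
Economics
Management
Pure mathematics
Mathematical analysis
Mathematics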
Authors
Jiakang Li, Ruixia Liu, Lihui Su, Shikai Zhang
Abstract
Chinese electronic medical records (EMRs) lack abundant annotated data, which severely limits the performance of named entity recognition (NER) models in this domain. To address this problem, we propose a Chinese EMR named entity recognition model based on pre-training and multi-task learning (Pt-Mt). The model first fine-tunes the improved pre-trained model RoBERTa on different medical datasets, so that RoBERTa better fits the characteristics of the medical field and learns features from different datasets. In addition, the Chinese word segmentation (CWS) task is added as an auxiliary task and trained jointly with the NER model, which enhances the model's ability to distinguish entity boundaries. The resulting model, which combines RoBERTa with multi-task learning, achieves good results on the CCKS2017, CCKS2019, and CCKS2020 datasets.
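The abstract does not give implementation details, so the following is only a minimal sketch of how a CWS auxiliary task can share the same character sequence as NER. It assumes a BMES tagging scheme for segmentation and a simple weighted sum of the two task losses. The function names `cws_tags` and `joint_loss`, the weight `aux_weight`, the `SYMPTOM` label, and the example sentence are all hypothetical and are not taken from the paper.

```python
from typing import List


def cws_tags(words: List[str]) -> List[str]:
    """Convert a segmented sentence into per-character BMES tags.

    B = begin, M = middle, E = end of a multi-character word,
    S = single-character word. (Assumed scheme, not from the paper.)
    """
    tags: List[str] = []
    for word in words:
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def joint_loss(ner_loss: float, cws_loss: float, aux_weight: float = 0.5) -> float:
    """Combine the main NER loss with the auxiliary CWS loss.

    A weighted sum is one common choice; aux_weight is a hypothetical
    hyperparameter, not a value reported by the authors.
    """
    return ner_loss + aux_weight * cws_loss


# Hypothetical example: both label sequences align with the same characters,
# so a shared encoder can feed two tagging heads.
words = ["患者", "出现", "头痛"]
chars = [c for w in words for c in w]
ner = ["O", "O", "O", "O", "B-SYMPTOM", "I-SYMPTOM"]
cws = cws_tags(words)

for ch, n, s in zip(chars, ner, cws):
    print(ch, n, s)
```

In this toy sentence the word boundary of 头痛 coincides with the entity boundary, which illustrates why segmentation labels can give the NER head an additional boundary signal.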