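Conditional random field
Computer science
Artificial intelligence
Natural language processing
Word (group theory)
Sequence labeling
Context (archaeology)
Set (abstract data type)
Named-entity recognition
Field (mathematics)
Feature (linguistics)
Face (sociological concept)
Representation (politics)
Pattern recognition (psychology)
Feature vector
Mathematics
Management
Task (project management)
Geometry
Political science
Pure mathematics
Law
Programming language
Biology
Social science
Economics
Linguistics
Paleontology
Sociology
Philosophy
Politics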
Identifier
DOI:10.1109/icesit53460.2021.9696907
Abstract
To address the scarcity of corpora in drug-related fields and the lack of information about drug-related personnel, this paper constructs a drug-related text data set on the scale of 600,000 words and proposes a named entity recognition method for drug-related personnel based on ELECTRA-BiLSTM-CRF. First, the labeled text is fed into the ELECTRA pre-trained language model to obtain word vectors with better semantic representation; next, these word vectors are fed into a bidirectional long short-term memory (BiLSTM) network to extract contextual features; finally, a conditional random field (CRF) produces the best predicted label sequence. The model was evaluated on the drug-related text data set. The experimental results show that the ELECTRA-BiLSTM-CRF model reaches an F1 value of 94%, outperforming the BERT-BiLSTM-CRF, BERT-CRF, and BiLSTM-CRF models, which indicates that the model is effective for named entity recognition of drug-related personnel.
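
The final step of the pipeline, in which the CRF selects the best label sequence, can be illustrated with a small sketch. The abstract does not say how decoding is implemented; the code below assumes the usual Viterbi decoding for a linear-chain CRF. The function name `viterbi_decode`, the tag set (`O`, `B-PER`, `I-PER`), and all scores are hypothetical values chosen for illustration and are not taken from the paper. In the described model, the per-token emission scores would come from the BiLSTM layer and the transition scores would be learned by the CRF.

```python
from typing import List, Sequence


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
    labels: Sequence[str],
) -> List[str]:
    """Return the highest-scoring label sequence for a linear-chain CRF.

    emissions[t][j]   : score of label j at token t (assumed BiLSTM output)
    transitions[i][j] : score of moving from label i to label j
    """
    n_labels = len(labels)
    # score[j] = best score of any path that ends in label j at the current token
    score = list(emissions[0])
    backpointers = []

    for t in range(1, len(emissions)):
        new_score = []
        pointers = []
        for j in range(n_labels):
            # Best previous label i for reaching label j at token t
            best_i = max(range(n_labels), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final label to recover the full path
    best_last = max(range(n_labels), key=lambda j: score[j])
    path = [best_last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return [labels[i] for i in path]


# Hypothetical tag set and scores, for illustration only
labels = ["O", "B-PER", "I-PER"]
transitions = [
    [0.5, 0.0, -100.0],  # from O: moving to I-PER is heavily penalized
    [0.0, -1.0, 1.0],    # from B-PER
    [0.0, -1.0, 0.5],    # from I-PER
]
emissions = [
    [1.0, 0.8, 0.0],  # token 0: per-token argmax would pick O
    [0.0, 0.2, 1.5],  # token 1: per-token argmax would pick I-PER
    [2.0, 0.0, 0.3],  # token 2
]

print(viterbi_decode(emissions, transitions, labels))
# → ['B-PER', 'I-PER', 'O']
```

Choosing the highest-scoring label for each token independently would give `O, I-PER, O` here, which is not a valid tag sequence. Because the transition scores are part of the path score, the decoder prefers `B-PER, I-PER, O` instead, which is the kind of constraint a CRF layer adds on top of BiLSTM outputs.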