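Computer science
Named entity recognition
Artificial intelligence
Parsing
Natural language processing
Biomedical text mining
Entity linking
Relationship extraction
Graph
Dependency grammar
Task (project management)
Information extraction
Text mining
Knowledge base
Management
Theoretical computer science
Economics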
Authors
Yumeng Yang, Hongfei Lin, Zhihao Yang, Yijia Zhang, Di Zhao, Shuaiheng Huai
Identifier
DOI:10.1016/j.jbi.2023.104317
Abstract
Named entity recognition is a key task in text mining. In the biomedical field, entity recognition focuses on extracting key information from large-scale biomedical texts for downstream information extraction tasks. Biomedical literature contains a large amount of text with long-range dependencies, and previous studies use external syntactic parsing tools to capture word dependencies in sentences in order to recognize nested biomedical entities. However, external parsing tools often introduce unnecessary noise into the auxiliary task and cannot improve entity recognition performance in an end-to-end way. Therefore, we propose a novel automatic dependency parsing approach, namely the ADPG model, which fuses syntactic structure information in an end-to-end way to recognize biomedical entities. Specifically, the method uses a multilayer Tree-Transformer structure to automatically extract the semantic representation and syntactic structure of sentences with long-range dependencies, and then applies a multilayer graph attention network (GAT) to extract the dependency paths between words in that syntactic structure, improving the performance of biomedical entity recognition. We evaluated our ADPG model on three biomedical-domain datasets and one news-domain dataset, and the experimental results demonstrate that our model achieves state-of-the-art results on these four datasets and shows a degree of generalization ability. Our model is released on GitHub: https://github.com/Yumeng-Y/ADPG.
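To make the graph attention step in the abstract concrete, the following is a minimal sketch of one single-head graph attention layer over a word graph, written in plain Python. It is not the authors' implementation (see the GitHub repository for that). The function name `gat_layer`, the toy token features, the weights, and the hand-written adjacency matrix are all assumptions for illustration; in ADPG the word graph is induced automatically by the Tree-Transformer rather than supplied by hand, and the output nonlinearity is omitted here for brevity.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def matvec(W, h):
    # Multiply matrix W (list of rows) by vector h.
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def gat_layer(H, adj, W, a):
    """One single-head graph attention layer (illustrative sketch).

    H   : n token feature vectors, each of length d_in
    adj : n x n 0/1 matrix; adj[i][j] = 1 if words i and j are linked
    W   : d_out x d_in projection matrix
    a   : attention vector of length 2 * d_out
    Returns (new features, attention matrix).
    """
    n = len(H)
    Z = [matvec(W, h) for h in H]  # projected token features
    d_out = len(Z[0])
    out, attn = [], []
    for i in range(n):
        # Attend only to graph neighbours, plus a self-loop.
        nbrs = [j for j in range(n) if adj[i][j] or j == i]
        # Score each neighbour: LeakyReLU(a . [z_i || z_j]).
        scores = [
            leaky_relu(sum(p * q for p, q in zip(a, Z[i] + Z[j])))
            for j in nbrs
        ]
        # Numerically stable softmax over the neighbourhood.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        row = [0.0] * n
        for j, e in zip(nbrs, exps):
            row[j] = e / total
        attn.append(row)
        # New feature for word i: attention-weighted sum of neighbours.
        out.append([sum(row[j] * Z[j][k] for j in nbrs) for k in range(d_out)])
    return out, attn

# Hypothetical 3-word sentence with a chain graph: word0 - word1 - word2.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
W = [[1.0, 0.0], [0.0, 1.0]]   # identity projection, for readability
a = [0.1, 0.2, 0.3, 0.4]       # arbitrary attention parameters

out, attn = gat_layer(H, adj, W, a)
```

Each row of `attn` is a probability distribution over a word's neighbours, so words that are not linked in the graph (here, word 0 and word 2) contribute nothing to each other's updated representation. Stacking several such layers lets information travel along longer dependency paths.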