Computer Science
Transformer
Artificial Intelligence
Natural Language Processing
Electrical Engineering
Engineering
Voltage
Authors
Zhigang Jin,Xiaoyong He,Xiaodong Wu,Xianfeng Zhao
Identifiers
DOI:10.1016/j.eswa.2022.118385
Abstract
Named entity recognition (NER) plays an important role in many downstream natural language processing tasks, such as knowledge extraction and information retrieval. Chinese NER is more challenging than English NER due to the lack of explicit word boundaries. Feature augmentation is a promising way to improve Chinese NER models, and pre-trained models can implicitly preserve prior knowledge in additional features. This paper proposes a hybrid Transformer approach that first fuses additional feature embeddings (e.g., character embeddings, bigram embeddings, lattice embeddings, and BERT embeddings) into distributed representations to strengthen the model's representation ability. In addition, a new training strategy, named the DF strategy, is proposed to efficiently fine-tune Bidirectional Encoder Representations from Transformers (BERT) and the other embeddings in a balanced way. The proposed model then perceives the relations among features by introducing relative position embeddings into an additional adapted Transformer encoder. Lastly, a standard Conditional Random Field is used to alleviate obvious tag errors. The proposed model is applied to four representative Chinese datasets to investigate its performance. Experimental results show that the proposed model outperforms other popular models in terms of accuracy, and that the proposed BL-BTC model effectively improves recognition performance on both formal and informal texts.
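The abstract describes a pipeline of fused feature embeddings feeding a Transformer encoder with a tag decoder on top. Below is a minimal PyTorch sketch of that general architecture, not the paper's exact model: all dimensions are illustrative assumptions, lattice embeddings and the DF fine-tuning strategy are omitted, a standard `nn.TransformerEncoder` stands in for the paper's adapted encoder with relative position embeddings, and a per-token argmax stands in for the CRF decoding layer.

```python
import torch
import torch.nn as nn

class HybridTransformerNER(nn.Module):
    """Illustrative sketch: fuse char/bigram/BERT embeddings, encode, tag.

    Simplifications vs. the paper: no lattice embeddings, no relative
    position embeddings, no CRF; module sizes are arbitrary assumptions.
    """
    def __init__(self, char_vocab, bigram_vocab, num_tags,
                 char_dim=64, bigram_dim=64, bert_dim=768, d_model=256):
        super().__init__()
        self.char_emb = nn.Embedding(char_vocab, char_dim)
        self.bigram_emb = nn.Embedding(bigram_vocab, bigram_dim)
        # Fuse the additional feature embeddings by concatenation + projection.
        self.fuse = nn.Linear(char_dim + bigram_dim + bert_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.to_tags = nn.Linear(d_model, num_tags)

    def forward(self, char_ids, bigram_ids, bert_vecs):
        # bert_vecs: precomputed contextual embeddings, shape (B, T, bert_dim).
        fused = torch.cat([self.char_emb(char_ids),
                           self.bigram_emb(bigram_ids),
                           bert_vecs], dim=-1)
        hidden = self.encoder(self.fuse(fused))
        return self.to_tags(hidden)  # per-token emission scores over tags

# Toy usage with random inputs (batch of 2 sentences, 10 characters each).
model = HybridTransformerNER(char_vocab=5000, bigram_vocab=20000, num_tags=9)
chars = torch.randint(0, 5000, (2, 10))
bigrams = torch.randint(0, 20000, (2, 10))
bert = torch.randn(2, 10, 768)
tags = model(chars, bigrams, bert).argmax(-1)  # greedy decode; paper uses CRF
print(tags.shape)  # torch.Size([2, 10])
```

The CRF the paper places on top would replace the argmax with structured decoding over the emission scores, penalizing invalid tag transitions (e.g., I-PER directly after B-LOC).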