Journal: Revista De Chimie [Revista de Chimie SRL] | Date: 2021-10-28 | Volume (Issue): 72 (4): 52-64 | Citations: 1
Identifier
DOI:10.37358/rc.21.4.8456
Abstract
Biomedical Named Entity Recognition (BNER) is the identification of entities such as drugs, genes, and chemicals in biomedical text, which supports information extraction from the domain literature. It allows extraction of information such as drug profiles, similar or related drugs, and associations between drugs and their targets. Although many machine learning methods have been applied, this area still offers room for improvement. Performance can be improved in particular for biologically related chemical entities, whose structures and properties vary widely. The proposed approach combines two state-of-the-art algorithms, conditional random fields (CRF) and support vector machines (SVM), and aims to improve performance by applying them to varied feature sets, including linguistic, orthographic, morphological, domain, and local context features. It uses the sequence tagging capability of CRF to identify entity boundaries and the classification efficiency of SVM to detect entity subtypes. The method is tested on two datasets with different entity types: (1) the GENIA corpus and (2) the CHEMDNER corpus. The results show that the proposed hybrid method improves BNER compared with conventional machine learning algorithms. The paper also discusses SVM and the associated methodologies in detail; linear and non-linear text classification is covered in Section 3. The final section describes the results and the evaluation of the proposed method.
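The two-stage idea in the abstract (boundary detection followed by subtype classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature names, the B/I/O boundary tag scheme, and the helpers `token_features`, `word_shape`, `bio_to_spans`, and `label_entities` are all assumptions made for illustration, and the trained CRF and SVM are replaced by a precomputed tag list and an arbitrary callable.

```python
import re


def word_shape(tok):
    """Collapse a token to a coarse orthographic shape, e.g. 'IL-2' -> 'A-0'."""
    shape = re.sub(r"[A-Z]", "A", tok)
    shape = re.sub(r"[a-z]", "a", shape)
    shape = re.sub(r"[0-9]", "0", shape)
    # Merge runs of the same shape character into one.
    return re.sub(r"(.)\1+", r"\1", shape)


def token_features(tokens, i, window=1):
    """Hypothetical orthographic, morphological and local-context features."""
    tok = tokens[i]
    feats = {
        "lower": tok.lower(),
        "is_capitalized": tok[:1].isupper(),   # orthographic
        "has_digit": any(c.isdigit() for c in tok),
        "has_hyphen": "-" in tok,
        "shape": word_shape(tok),
        "prefix3": tok[:3].lower(),            # morphological
        "suffix3": tok[-3:].lower(),
    }
    # Local context: neighbouring tokens inside the window, padded at the edges.
    for off in range(-window, window + 1):
        if off == 0:
            continue
        j = i + off
        feats[f"ctx[{off:+d}]"] = tokens[j].lower() if 0 <= j < len(tokens) else "<PAD>"
    return feats


def bio_to_spans(tags):
    """Convert boundary tags (assumed 'B', 'I', 'O') into (start, end) spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "I":
            if start is None:      # tolerate an 'I' that has no preceding 'B'
                start = i
        else:
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def label_entities(tokens, boundary_tags, subtype_classifier):
    """Stage 2: assign a subtype to each span found in stage 1.

    `subtype_classifier` is any callable taking a list of tokens; in the
    paper's setting this role would be played by a trained SVM.
    """
    return [
        (" ".join(tokens[s:e]), subtype_classifier(tokens[s:e]))
        for s, e in bio_to_spans(boundary_tags)
    ]
```

In use, the per-token feature dictionaries would be fed to a sequence tagger to produce the boundary tags, and each resulting span would then be passed to the subtype classifier. Separating the two stages lets the sequence model work with a small tag set while the classifier handles the distinction between entity types.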