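Named-entity recognition
Computer science
Python (programming language)
Natural language processing
Artificial intelligence
Paragraph
Sentence
Java
Information extraction
Named entity
Disk formatting
Programming language
Information retrieval
World Wide Web
Management
Economics
Operating system
Task (project management)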
Authors
Hemlata Shelar, Gagandeep Kaur, Neha Heda, Poorva Agrawal
Identifier
DOI:10.1080/0194262x.2020.1759479
Abstract
Named entity recognition (NER) is a natural language processing tool for extracting information from unstructured text such as e-mails, newspapers, and blogs. NER is the process of identifying nouns such as the names of people, places, and organizations that are mentioned in a string of text, a sentence, or a paragraph. Many libraries and natural language processing tools written in Java, Python, and Cython are available for building an NER system. All these tools have pretrained NER models that can be imported, used, and modified or customized according to requirements. This paper explains different NLP libraries, including Python’s spaCy, Apache OpenNLP, and TensorFlow. Some of these libraries provide a pre-built NER model that can be customized. The libraries are compared on training accuracy, F-score, prediction time, model size, and ease of training, with the same training and testing data used for all the models. Considering the overall performance of all the models, Python’s spaCy gives higher accuracy and the best result.
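As an illustration of the F-score criterion named in the abstract, the following is a minimal sketch of entity-level precision, recall, and F1 in plain Python. It is not the authors' evaluation code: the exact-match rule (same start offset, end offset, and label), the function name `entity_prf`, the sample sentence, and the labels are all assumptions made for this example.

```python
from typing import Iterable, Set, Tuple

# Assumed representation: an entity is (start_char, end_char, label).
# A prediction counts as correct only if all three fields match exactly.
Entity = Tuple[int, int, str]


def entity_prf(gold: Iterable[Entity],
               predicted: Iterable[Entity]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for exact-match entity spans."""
    gold_set: Set[Entity] = set(gold)
    pred_set: Set[Entity] = set(predicted)
    true_pos = len(gold_set & pred_set)

    # Guard against empty sets so that no division by zero occurs.
    precision = true_pos / len(pred_set) if pred_set else 0.0
    recall = true_pos / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical sentence and annotations, invented for illustration.
text = "Sundar Pichai joined Google in California."
gold = [(0, 13, "PERSON"), (21, 27, "ORG"), (31, 41, "GPE")]
# The hypothetical model finds all three spans but mislabels the last one.
predicted = [(0, 13, "PERSON"), (21, 27, "ORG"), (31, 41, "LOC")]

precision, recall, f1 = entity_prf(gold, predicted)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because the same function can score the output of any library once its entities are converted to `(start, end, label)` tuples, a sketch like this is one way models from different toolkits could be compared on identical test data.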