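Prescription
Biobank
Pipeline (software)
Computer science
Health informatics
Electronic health record
Pharmacogenomics
Health record
Data mining
Informatics
Medicine
Pharmacovigilance
Named-entity recognition
Data science
Artificial intelligence
Drug
Information retrieval
Engineering
Bioinformatics
Pharmacology
Health care
Public health
Nursing
Economics
Economic growth
Programming language
Systems engineering
Electrical engineering
Task (project management)
Biology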
Authors
Cristobal Colón-Ruíz,Tomas Fitzgerald,Isabel Segura-Bedmar,Ewan Birney,María Herrero-Zazo
Source
Journal: Cold Spring Harbor Laboratory - medRxiv
Date: 2023-10-05
Citations: 2
Identifier
DOI:10.1101/2023.10.04.23296481
Abstract
Electronic health record (EHR) systems with prescription data offer vast potential in pharmacoepidemiology and pharmacogenomics. The large amount of clinical data recorded in these systems requires automatic processing to extract relevant information. This paper introduces PRESNER, a named entity recognition (NER) and classification pipeline for EHR prescription data. As its core NER system, the pipeline uses the pre-trained transformer Bio-ClinicalBERT, fine-tuned on UK Biobank prescription entries manually annotated with medication-related information (drug name, route of administration, pharmaceutical form, strength, and dosage). PRESNER also maps drugs to the Anatomical Therapeutic Chemical (ATC) classification system and distinguishes between systemic and non-systemic drug products. It outperformed a baseline model that combines the state-of-the-art Med7 with a dictionary-based approach built on the ChEMBL database, achieving a macro-average F1-score of 0.95 versus 0.71. Beyond UK Biobank prescription data, PRESNER can be applied to other English-language prescription datasets, making it a versatile tool for researchers in the field.
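The macro-average F1-score reported above gives every entity label equal weight, regardless of how often it occurs. The sketch below shows one way such a score can be computed. It is an illustration only: the function name `macro_f1`, the token-level scoring, the label strings, and the toy data are assumptions, and the abstract does not state the exact evaluation protocol the authors used.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Macro-averaged F1: per-label F1 scores averaged with equal weight.

    `gold` and `pred` are equal-length sequences of labels (hypothetical
    token-level labels here; entity-level scoring would differ).
    """
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for label g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was missed
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data, invented for illustration (not from the paper).
gold = ["DRUG", "STRENGTH", "FORM", "DRUG", "ROUTE"]
pred = ["DRUG", "STRENGTH", "DRUG", "DRUG", "ROUTE"]
print(round(macro_f1(gold, pred), 2))
```

In this toy case the single missed `FORM` token drives that label's F1 to zero, which pulls the macro average down far more than it would a frequency-weighted average. That sensitivity to rare labels is why the macro average is a common choice when label frequencies are imbalanced.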