计算机科学
结构化
编码
图形
情报检索
关系抽取
二元关系
任务(项目管理)
知识图
人工智能
自然语言处理
数据科学
信息抽取
理论计算机科学
数学
基因
离散数学
生物化学
财务
经济
化学
管理
作者
Linh D. Dang,Thi-Phuong-Uyen PHAN,Nhung T. H. Nguyen
标识
DOI:10.1016/j.jbi.2023.104460
摘要
While a large number of knowledge graphs have previously been developed by automatically extracting and structuring knowledge from literature, there is currently no such knowledge graph that encodes relationships between food, biochemicals and mental illnesses, even though a large amount of knowledge about these relationships is available in the form of unstructured text in biomedical literature articles. To address this limitation, this article describes the development of GENA - (Graph of mEntal-health and Nutrition Association), a knowledge graph that represents relations between nutrition and mental health, extracted from biomedical abstracts. GENA is constructed from PubMed abstracts that contain keywords relating to chemicals, food, and health. A hybrid named entity recognition (NER) model is firstly applied to these abstracts to identify various entities of interest. Subsequently, a deep syntax-based relation extraction model is used to detect binary relations between the identified entities. Finally, the resulting relations are used to populate the GENA knowledge graph, whose relationships can be accessed in an intuitive and interpretable manner using the Neo4J Database Management System. To evaluate the reliability of GENA, two annotators manually assessed a subset of the extracted relations. The evaluation results show that our methods obtain high precision for the NER task and acceptable precision and relative recall for the relation extraction task. GENA consists of 43,367 relationships that encode information about nutrition and health, of which 94.04% are new relations that are not present in existing ontologies of food and diseases. GENA is constructed based on scientific principles, and has the potential to be used within further applications to contribute towards scientific research within the domain. It is a pioneering knowledge graph in nutrition and mental health, containing a diverse range of relationship types. All of our source code and results are publicly available at https://github.com/ddlinh/gena-db.
科研通智能强力驱动
Strongly Powered by AbleSci AI