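Embedding
Computer science
Hyperbolic space
Theoretical computer science
Semantics (computer science)
Graph
Encoding
Euclidean space
Euclidean distance
Data mining
Artificial intelligence
Mathematics
Pure mathematics
Gene
Chemistry
Biochemistry
Programming language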
Authors
Nan Li, Zhihao Yang, Yongrong Yang, Jian Wang, Hongfei Lin
Identifier
DOI:10.1016/j.jbi.2023.104503
Abstract
Predicting relationships between biological entities can greatly benefit important biomedical problems. Previous studies have represented biological entities and relationships in Euclidean space using embedding methods, which represent entities as numerical vectors and evaluate semantic similarity between them. However, these methods cannot prevent the loss of latent hierarchical information when embedding large graph-structured data into Euclidean space, and therefore cannot capture the semantics of entities and relationships accurately. Hyperbolic spaces, such as the Poincaré ball, are better suited to hierarchical modeling than Euclidean spaces. This is because hyperbolic spaces have negative curvature, so distances grow exponentially as points approach the boundary. In this paper, we propose HEM, a hyperbolic hierarchical knowledge graph embedding model that generates vector representations of bio-entities. By encoding entities and relations in hyperbolic space, HEM can capture latent hierarchical information and improve the accuracy of biological entity representation. Notably, HEM can preserve rich information in fewer dimensions than methods that encode entities in Euclidean space. Furthermore, we explore the performance of HEM in protein–protein interaction prediction and gene-disease association prediction tasks. Experimental results demonstrate the superior performance of HEM over state-of-the-art baselines. The data and code are available at https://github.com/Nan-ll/HEM.
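The abstract's point about distances growing toward the boundary can be illustrated with the standard distance formula for the unit Poincaré ball, d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))). The following is a minimal sketch, assuming curvature −1 and plain Python lists as points. It is not HEM's implementation (that lives in the linked repository), and the names `poincare_distance` and `euclidean_distance` are hypothetical helpers chosen for illustration.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincaré ball.

    Assumes curvature -1; this is the textbook formula, not HEM's scoring function.
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v))
    return math.acosh(arg)


def euclidean_distance(u, v):
    """Ordinary straight-line distance, for comparison."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Two pairs of points with the same Euclidean separation (0.1):
# one pair near the origin, one pair near the boundary of the ball.
near_origin = ([0.0, 0.0], [0.1, 0.0])
near_boundary = ([0.85, 0.0], [0.95, 0.0])

if __name__ == "__main__":
    for name, (p, q) in [("near origin", near_origin),
                         ("near boundary", near_boundary)]:
        print(name,
              "euclidean:", round(euclidean_distance(p, q), 4),
              "hyperbolic:", round(poincare_distance(p, q), 4))
```

Although both pairs are equally far apart in Euclidean terms, the pair near the boundary is several times farther apart in hyperbolic terms. This extra room near the boundary is what lets tree-like hierarchies, whose node counts grow exponentially with depth, fit into few dimensions.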