Logical consequence
Function (biology)
Computer science
Artificial intelligence
Computational biology
Biology
Genetics
Authors
Maxat Kulmanov, Francisco J. Guzmán‐Vega, Paula Duek Roggli, Lydie Lane, Stefan T. Arold, Robert Hoehndorf
Identifier
DOI:10.1038/s42256-024-00795-w
Abstract
The Gene Ontology (GO) is a formal, axiomatic theory with over 100,000 axioms that describe the molecular functions, biological processes and cellular locations of proteins in three subontologies. Predicting the functions of proteins using the GO requires both learning and reasoning capabilities in order to maintain consistency and exploit the background knowledge in the GO. Many methods have been developed to automatically predict protein functions, but effectively exploiting all the axioms in the GO for knowledge-enhanced learning has remained a challenge. We have developed DeepGO-SE, a method that predicts GO functions from protein sequences using a pretrained large language model. DeepGO-SE generates multiple approximate models of GO, and a neural network predicts the truth values of statements about protein functions in these approximate models. We aggregate the truth values over multiple models so that DeepGO-SE approximates semantic entailment when predicting protein functions. We show, using several benchmarks, that the approach effectively exploits background knowledge in the GO and improves protein function prediction compared to state-of-the-art methods.
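The aggregation step described in the abstract can be illustrated with a minimal sketch. A statement is semantically entailed when it is true in every model, so one natural way to approximate entailment over a finite set of models is to take, for each GO term, the minimum truth value across them. The abstract does not state which aggregation DeepGO-SE uses, so the minimum here is an assumption; the function name `approximate_entailment` and the scores are hypothetical and serve only as illustration.

```python
from typing import Dict, List


def approximate_entailment(
    truth_values_per_model: List[Dict[str, float]],
) -> Dict[str, float]:
    """Aggregate per-model truth values by taking the minimum per GO term.

    Assumption: "true in all models" is approximated by the minimum score
    over a finite set of approximate models. This is an illustrative choice,
    not necessarily the aggregation used by DeepGO-SE.
    """
    if not truth_values_per_model:
        raise ValueError("at least one model is required")

    # Only terms scored in every model can be aggregated.
    terms = set(truth_values_per_model[0])
    for scores in truth_values_per_model[1:]:
        terms &= set(scores)

    return {
        term: min(scores[term] for scores in truth_values_per_model)
        for term in sorted(terms)
    }


# Hypothetical truth values for one protein in three approximate models.
scores = [
    {"GO:0003674": 0.9, "GO:0005515": 0.75},
    {"GO:0003674": 0.8, "GO:0005515": 0.25},
    {"GO:0003674": 0.95, "GO:0005515": 0.5},
]
aggregated = approximate_entailment(scores)
print(aggregated)
```

Under this choice, a term receives a high aggregated score only if every model assigns it a high truth value, which mirrors the "true in all models" reading of entailment.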