Transformer
Computer science
Topology (circuits)
Artificial intelligence
Benchmark (surveying)
Machine learning
Algorithm
Theoretical computer science
Mathematics
Engineering
Voltage
Electrical engineering
Geodesy
Combinatorics
Geography
Authors
Guo-Wei Wei, Dong Chen, Jian Liu
Source
Journal: Research Square
Date: 2024-02-09
Identifier
DOI:10.21203/rs.3.rs-3640878/v1
Abstract
Pre-trained deep Transformers have had tremendous success in a wide variety of disciplines. However, in computational biology, essentially all Transformers are built upon biological sequences, which ignores vital stereochemical information and may result in crucial errors in downstream predictions. On the other hand, three-dimensional (3D) molecular structures are incompatible with the sequential architecture of Transformers and natural language processing (NLP) models in general. This work addresses this foundational challenge with a topological Transformer (TopoFormer). TopoFormer is built by integrating NLP with a multiscale topological technique, the persistent topological hyperdigraph Laplacian (PTHL), which systematically converts intricate 3D protein-ligand complexes at various spatial scales into an NLP-admissible sequence of topological invariants and homotopic shapes. Element-specific PTHLs are further developed to embed crucial physical, chemical, and biological interactions into topological sequences. TopoFormer surges ahead of conventional algorithms and recent deep learning variants, delivering exemplary scoring accuracy and superior performance in ranking, docking, and screening tasks across a number of benchmark datasets. The proposed topological sequences can be extracted from all kinds of structural data to facilitate various NLP models, heralding a new era in AI-driven discovery.
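The abstract's central step, converting a 3D structure into an NLP-admissible sequence of scale-indexed topological invariants, can be illustrated with a much simpler stand-in. The sketch below is not the authors' PTHL: it uses plain graph Laplacians over a distance filtration, and all function names, the toy point cloud, and the choice of features (zeroth Betti number plus smallest nonzero eigenvalue per scale) are assumptions made purely for illustration.

```python
# Illustrative sketch only: the paper's persistent topological hyperdigraph
# Laplacian (PTHL) is far richer. Here we mimic the core idea -- turning a 3D
# point cloud into a scale-indexed sequence of topological/spectral "tokens" --
# using ordinary graph Laplacians over a distance filtration.
import numpy as np

def laplacian_spectrum(points: np.ndarray, cutoff: float) -> np.ndarray:
    """Eigenvalues of the graph Laplacian of the distance graph at `cutoff`."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    adj = ((dist <= cutoff) & (dist > 0)).astype(float)  # edges within cutoff
    lap = np.diag(adj.sum(axis=1)) - adj                 # L = D - A
    return np.linalg.eigvalsh(lap)

def topological_sequence(points: np.ndarray, cutoffs) -> list:
    """One token per filtration scale: (Betti-0, smallest nonzero eigenvalue)."""
    tokens = []
    for r in cutoffs:
        eig = laplacian_spectrum(points, r)
        betti0 = int(np.sum(eig < 1e-8))   # zero eigenvalues = connected components
        nonzero = eig[eig >= 1e-8]
        fiedler = float(nonzero.min()) if nonzero.size else 0.0
        tokens.append((betti0, round(fiedler, 3)))
    return tokens

# Toy "complex": random atom positions in a 10-angstrom box (hypothetical data).
rng = np.random.default_rng(0)
atoms = rng.uniform(0.0, 10.0, size=(20, 3))
print(topological_sequence(atoms, cutoffs=np.linspace(2.0, 8.0, 7)))
```

The resulting list of per-scale tokens is what a sequence model could consume in place of raw coordinates. By analogy, the element-specific PTHLs mentioned in the abstract would presumably run such a filtration separately for chemically meaningful atom-type subsets and concatenate the resulting sequences, though the paper's actual construction operates on hyperdigraphs rather than simple distance graphs.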