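Computer science
Artificial intelligence
Graph
Deep learning
Convolutional neural network
Artificial neural network
Protein structure prediction
Perplexity
Pattern recognition (psychology)
Theoretical computer science
Algorithm
Protein structure
Language model
Biology
Biochemistry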
Authors
Xing Zhang,Yin Hong-mei,Fei Ling,Jian Zhan,Yaoqi Zhou
Identifier
DOI:10.1371/journal.pcbi.1011330
Abstract
Recent advances in deep learning have significantly improved the ability to infer protein sequences directly from protein structures for fixed-backbone design. The methods have evolved from the early use of multi-layer perceptrons to convolutional neural networks, transformers, and graph neural networks (GNNs). However, the conventional approach of constructing a K-nearest-neighbors (KNN) graph for a GNN has limited the utilization of edge information, which plays a critical role in network performance. Here we introduce SPIN-CGNN, which uses protein contact maps to define nearest neighbors. With auxiliary edge updates and selective kernels, we found that SPIN-CGNN provided refolding ability by AlphaFold2 comparable to that of current state-of-the-art techniques, but a significant improvement over them in terms of sequence recovery, perplexity, deviation from the amino-acid compositions of native sequences, conservation of hydrophobic positions, and low-complexity regions, according to tests on unseen structures, “hallucinated” structures, and structures generated by diffusion models. The results suggest that low-complexity regions in sequences designed by deep learning, particularly for generated structures, still need improvement when compared with native sequences.
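
The contrast the abstract draws between a KNN graph and a contact-map graph can be illustrated with a minimal sketch. This is not the SPIN-CGNN implementation: the function names, the use of one coordinate per residue, and the 8 Å cutoff are all assumptions made for illustration only. The point is that a KNN graph gives every residue exactly k neighbors regardless of distance, whereas a contact-map graph connects only residues that are actually close in space.

```python
import math


def pairwise_distances(coords):
    """Return the full matrix of Euclidean distances between residues."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def knn_edges(coords, k):
    """Directed edges (i, j) where j is one of the k nearest residues to i."""
    dist = pairwise_distances(coords)
    edges = set()
    for i, row in enumerate(dist):
        others = sorted((j for j in range(len(row)) if j != i),
                        key=lambda j: row[j])
        edges.update((i, j) for j in others[:k])
    return edges


def contact_edges(coords, cutoff):
    """Directed edges (i, j) for every residue pair within `cutoff`.

    The cutoff value is a hypothetical choice, not taken from the paper.
    """
    dist = pairwise_distances(coords)
    n = len(coords)
    return {(i, j) for i in range(n) for j in range(n)
            if i != j and dist[i][j] <= cutoff}


# Toy coordinates (in angstroms): three residues in a row plus one far away.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]

knn = knn_edges(coords, k=2)
contacts = contact_edges(coords, cutoff=8.0)  # assumed 8 angstrom cutoff

print(sorted(knn))
print(sorted(contacts))
```

In this toy example the distant residue still receives two outgoing edges in the KNN graph, while it has no edges at all in the contact-map graph, so the number of neighbors per residue varies with local packing.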