Interactome
Computer science
Computational biology
Data science
Biology
Genetics
Gene
Authors
Jing Zhang,Ian R. Humphreys,Jimin Pei,Jinuk Kim,Chulwon Choi,Rongqing Yuan,Jesse Durham,Бо Лю,Hee‐Jung Choi,Minkyung Baek,Ivan Anishchenko,Nick V. Grishin
Identifier
DOI:10.1101/2024.10.01.615885
Abstract
Protein-protein interactions (PPIs) are essential for biological function. Recent advances in coevolutionary analysis and deep learning (DL)-based protein structure prediction have enabled comprehensive PPI identification in bacterial and yeast proteomes, but these approaches have had limited success to date for the more complex human proteome. Here, we overcome this challenge by 1) enhancing the coevolutionary signals with 7-fold deeper multiple sequence alignments harvested from 30 petabytes of unassembled genomic data, and 2) developing a new DL network trained on augmented datasets of domain-domain interactions from 200 million predicted protein structures. These advancements allow us to systematically screen 200 million human protein pairs and predict 18,316 PPIs with an expected precision of 90%, of which 5,578 are novel predictions. 3D models of these predicted PPIs nearly triple the number of human PPIs with accurate structural information, providing numerous insights into protein function and mechanisms of human diseases.