计算机科学
人工智能
概化理论
变压器
机器学习
小分子
计算
算法
化学
数学
物理
生物化学
量子力学
电压
统计
作者
Alexander Kroll,Sahasra Ranjan,Martin J. Lercher
标识
DOI:10.1101/2023.08.21.554147
摘要
ABSTRACT Most drugs are small molecules, with their activities typically arising from interactions with protein targets. Accurate predictions of these interactions could greatly accelerate pharmaceutical research. Current machine learning models designed for this task have a limited ability to generalize beyond the proteins used for training. This limitation is likely due to a lack of information exchange between the protein and the small molecule during the generation of the required numerical representations. Here, we introduce ProSmith, a machine learning framework that employs a multimodal Transformer Network to simultaneously process protein amino acid sequences and small molecule strings in the same input. This approach facilitates the exchange of all relevant information between the two types of molecules during the computation of their numerical representations, allowing the model to account for their structural and functional interactions. Our final model combines gradient boosting predictions based on the resulting multimodal Transformer Network with independent predictions based on separate deep learning representations of the proteins and small molecules. The resulting predictions outperform all previous models for predicting drug-target interactions, and the model demonstrates unprecedented generalization capabilities to unseen proteins. We further show that the superior performance of ProSmith is not limited to drug-target interaction predictions, but also leads to improvements in other protein-small molecule interaction prediction tasks, the prediction of Michaelis constants K M of enzyme-substrate pairs and the identification of potential substrates for enzymes. The Python code provided can be used to easily implement and improve machine learning predictions of interactions between proteins and arbitrary drug candidates or other small molecules.
科研通智能强力驱动
Strongly Powered by AbleSci AI