Property (philosophy)
Pipeline (software)
Dual (grammatical number)
Computer science
Programming language
Linguistics
Philosophy
Epistemology
Authors
Li Wang,Jinsong Shao,Qiang Gong,Zeyu Yin,Yu Chen,Yajie Hao,Lei Zhang,Linlin Jiang,Min Yao,Jinlong Li,Fubo Wang
Source
Journal: Research Square
Date: 2025-04-07
Identifier
DOI:10.21203/rs.3.rs-6356959/v1
Abstract
The imperfect modeling of ternary complexes has limited the application of computer-aided drug discovery tools in PROTAC research and development. In this study, LM-PROTAC (language model-driven Proteolysis Targeting Chimera), a language model pipeline for PROTAC molecule design, was developed by embedding a transformer-based generative model with dual constraints on structure and properties. The study starts from the idea of segmenting and representing molecules and proteins. First, a language model-driven protein-compound affinity model was used to screen molecular fragments with high affinity for the target protein. Second, the structural and physicochemical properties of these fragments were constrained during generation to meet specific scenario requirements. Finally, a two-round screening was performed on the preliminarily generated molecules using a multidimensional property prediction model. This process identified a batch of PROTAC molecules capable of degrading disease-relevant target proteins. These molecules were subsequently validated through in vitro experiments, thus providing a complete solution for language model-driven PROTAC drug generation. Taking Wnt3a, a key tumor-related target, as the protein of interest (POI) for degradation, the LM-PROTAC pipeline successfully generated effective PROTAC molecules. Molecular distribution experiments demonstrated the high similarity of the generated molecules to the original dataset, validating the generative model’s effectiveness in accurately defining chemical space. Molecular dynamics simulations confirmed stable interactions between the PROTAC molecules and target proteins, while protein degradation experiments verified the efficacy of the generated PROTAC molecules in degrading target proteins. The entire LM-PROTAC pipeline is reusable and can generate degraders for other target proteins within 50 days, significantly improving the efficiency of drug discovery for undruggable targets.
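The three stages described in the abstract (fragment screening, constrained generation, two-round screening) can be pictured as a chain of filters. The following is only a minimal sketch of that control flow, not the authors' implementation: every name in it (`Candidate`, `screen_fragments`, `generate_candidates`, `two_round_screen`) is hypothetical, the scoring functions are supplied by the caller, and plain enumeration of fragment combinations stands in for the transformer-based generative model.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Candidate:
    """Hypothetical PROTAC candidate: warhead + linker + E3 ligase ligand."""
    warhead: str
    linker: str
    e3_ligand: str


def screen_fragments(
    fragments: Sequence[str],
    affinity: Callable[[str], float],
    threshold: float,
) -> List[str]:
    """Stage 1: keep fragments whose predicted affinity meets the threshold."""
    return [f for f in fragments if affinity(f) >= threshold]


def generate_candidates(
    warheads: Sequence[str],
    linkers: Sequence[str],
    e3_ligands: Sequence[str],
    satisfies_constraints: Callable[[Candidate], bool],
) -> List[Candidate]:
    """Stage 2 (stand-in): enumerate combinations and keep those that
    satisfy the structural/property constraints. The paper uses a
    generative model here; enumeration is an assumption of this sketch."""
    combos = (Candidate(w, l, e) for w, l, e in product(warheads, linkers, e3_ligands))
    return [c for c in combos if satisfies_constraints(c)]


def two_round_screen(
    candidates: Sequence[Candidate],
    round_one: Callable[[Candidate], bool],
    round_two: Callable[[Candidate], float],
    keep: int,
) -> List[Candidate]:
    """Stage 3: round one is a pass/fail filter; round two ranks the
    survivors by a predicted score and keeps the top `keep`."""
    survivors = [c for c in candidates if round_one(c)]
    return sorted(survivors, key=round_two, reverse=True)[:keep]
```

In practice the `affinity`, `satisfies_constraints`, `round_one`, and `round_two` callables would wrap trained prediction models; here they are left as parameters so the sketch makes no claim about how those models are built.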