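Computer science
Workflow
Block (permutation group theory)
Inference
Noise (video)
Set (abstract data type)
Informatics
Data mining
Process (computing)
Machine learning
Selection (genetic algorithm)
Artificial intelligence
Computational biology
Database
Programming language
Biology
Mathematics
Engineering
Electrical engineering
Image (mathematics)
Geometry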
Authors
Chris Zhang, Mary Pitman, Anjali Dixit, Sumudu P. Leelananda, Henri Palacci, Meghan Lawler, Svetlana Belyanskaya, LaShadric Grady, Jonathan Franklin, Nicolas Tilmans, David L. Mobley
Identifier
DOI:10.1021/acs.jcim.3c00588
Abstract
DNA-encoded libraries (DELs) provide the means to make and screen millions of diverse compounds against a target of interest in a single experiment. However, despite producing large volumes of binding data at a relatively low cost, the DEL selection process is susceptible to noise, necessitating computational follow-up to increase signal-to-noise ratios. In this work, we present a set of informatics tools to employ data from prior DEL screen(s) to gain information about which building blocks are most likely to be productive when designing new DELs for the same target. We demonstrate that similar building blocks have similar probabilities of forming compounds that bind. We then build a model from the inference that the combined behavior of individual building blocks is predictive of whether an overall compound binds. We illustrate our approach on a set of three-cycle OpenDEL libraries screened against soluble epoxide hydrolase (sEH) and report performance of more than an order of magnitude greater than random guessing on a holdout set, demonstrating that our model can serve as a baseline for comparison against other machine learning models on DEL data. Lastly, we provide a discussion on how we believe this informatics workflow could be applied to benefit researchers in their specific DEL campaigns.
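The idea that the combined behavior of individual building blocks can predict whether a compound binds can be illustrated with a minimal sketch. This is not the authors' model: the function names (`building_block_rates`, `score_compound`), the additive smoothing toward the overall hit rate, and the choice to multiply per-cycle rates are all assumptions made for illustration, and the toy data are invented placeholders rather than sEH screening results.

```python
from collections import defaultdict


def building_block_rates(compounds, labels, prior=1.0):
    """Estimate, for each cycle, a smoothed binder fraction per building block.

    compounds: list of tuples, one building-block ID per synthesis cycle.
    labels:    list of 0/1 flags (1 = compound was called a binder).
    prior:     hypothetical smoothing weight pulling rare blocks toward
               the overall hit rate (an assumption, not from the paper).
    """
    n_cycles = len(compounds[0])
    hits = [defaultdict(int) for _ in range(n_cycles)]
    totals = [defaultdict(int) for _ in range(n_cycles)]
    for blocks, is_binder in zip(compounds, labels):
        for cycle, block in enumerate(blocks):
            totals[cycle][block] += 1
            hits[cycle][block] += int(is_binder)

    base_rate = sum(labels) / len(labels)
    rates = []
    for cycle in range(n_cycles):
        rates.append({
            block: (hits[cycle][block] + prior * base_rate)
            / (totals[cycle][block] + prior)
            for block in totals[cycle]
        })
    return rates, base_rate


def score_compound(blocks, rates, base_rate):
    """Combine per-cycle rates by multiplication (an illustrative choice).

    Building blocks never seen in the prior screen fall back to the
    overall hit rate.
    """
    score = 1.0
    for cycle, block in enumerate(blocks):
        score *= rates[cycle].get(block, base_rate)
    return score


# Toy three-cycle library with made-up building-block IDs and labels.
train = [
    ("A1", "B1", "C1"),
    ("A1", "B2", "C1"),
    ("A2", "B1", "C2"),
    ("A2", "B2", "C2"),
]
labels = [1, 1, 0, 0]

rates, base_rate = building_block_rates(train, labels)
high = score_compound(("A1", "B1", "C1"), rates, base_rate)
low = score_compound(("A2", "B2", "C2"), rates, base_rate)
```

In this toy set, compounds built from blocks that co-occur with binders (`A1`, `C1`) receive a higher score than those built from `A2` and `C2`. Ranking candidate compounds by such a score is one simple way a baseline of this kind could be compared against random selection on a holdout set.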