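Drug discovery
Prioritization
Throughput
Computer science
Boosting (machine learning)
High-throughput screening
Machine learning
Artificial intelligence
Bioinformatics
Engineering
Biology
Telecommunications
Management science
Wireless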
Authors
Davide Boldini,Lukas Friedrich,Daniel Kühn,Stephan A. Sieber
Source
Journal: ACS Central Science
[American Chemical Society]
Date: 2024-03-15
Citations: 5
Identifier
DOI:10.1021/acscentsci.3c01517
Abstract
Efficient prioritization of bioactive compounds from high throughput screening campaigns is a fundamental challenge for accelerating drug development efforts. In this study, we present the first data-driven approach to simultaneously detect assay interferents and prioritize true bioactive compounds. By analyzing the learning dynamics during training of a gradient boosting model on noisy high throughput screening data using a novel formulation of sample influence, we are able to distinguish between compounds exhibiting the desired biological response and those producing assay artifacts. Therefore, our method enables false positive and true positive detection without relying on prior screens or assay interference mechanisms, making it applicable to any high throughput screening campaign. We demonstrate that our approach consistently excludes assay interferents with different mechanisms and prioritizes biologically relevant compounds more efficiently than all tested baselines, including a retrospective case study simulating its use in a real drug discovery campaign. Finally, our tool is extremely computationally efficient, requiring less than 30 s per assay on low-resource hardware. As such, our findings show that our method is an ideal addition to existing false positive detection tools and can be used to guide further pharmacological optimization after high throughput screening campaigns.