Chemistry
Profiling (computer programming)
Kinase
Computational biology
Biochemistry
Computer science
Biology
Operating system
Authors
Benjamin Merget, Samo Turk, Sameh Eid, Friedrich Rippmann, Simone Fulle
Identifier
DOI: 10.1021/acs.jmedchem.6b01611
Abstract
Kinome-wide screening would have the advantage of providing structure-activity relationships against hundreds of targets simultaneously. Here, we report the generation of ligand-based activity prediction models for over 280 kinases by employing Machine Learning methods on an extensive data set of proprietary bioactivity data combined with open data. High quality (AUC > 0.7) was achieved for ∼200 kinases by (1) combining open with proprietary data, (2) choosing Random Forest over alternative tested Machine Learning methods, and (3) balancing the training data sets. Tests on left-out and external data indicate a high value for virtual screening projects. Importantly, the derived models are evenly distributed across the kinome tree, allowing reliable profiling prediction for all kinase branches. The prediction quality was further improved by employing experimental bioactivity fingerprints of a small kinase subset. Overall, the generated models can support various hit identification tasks, including virtual screening, compound repurposing, and the detection of potential off-targets.
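The abstract names two ingredients that are easy to illustrate in isolation: balancing the training data and judging a model by its AUC against the 0.7 threshold. The sketch below is a minimal, standard-library-only illustration of those two ideas. It is not the authors' pipeline: the abstract does not say how balancing was done, so random undersampling is an assumption here, and the function names `roc_auc` and `undersample`, as well as the toy labels and scores, are hypothetical.

```python
import random


def roc_auc(labels, scores):
    """ROC AUC computed as the fraction of (active, inactive) pairs in which
    the active compound receives the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def undersample(records, seed=0):
    """Balance (features, label) records by randomly downsampling the larger
    class. Assumption: the paper's actual balancing scheme may differ."""
    rng = random.Random(seed)
    actives = [r for r in records if r[1] == 1]
    inactives = [r for r in records if r[1] == 0]
    k = min(len(actives), len(inactives))
    balanced = rng.sample(actives, k) + rng.sample(inactives, k)
    rng.shuffle(balanced)
    return balanced


# Toy data: 1 = active against a kinase, 0 = inactive (made up for illustration).
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = roc_auc(labels, scores)
passes_threshold = auc > 0.7  # quality cutoff quoted in the abstract

# Imbalanced toy training set: 2 actives, 5 inactives.
records = [(("cpd", i), 1 if i < 2 else 0) for i in range(7)]
balanced = undersample(records)
```

In practice the scores would come from a trained classifier such as a Random Forest, and the AUC would be computed on held-out compounds rather than on the training set.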