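Generalizability theory
Spurious relationship
Generalization
Computer science
Machine learning
Set (abstract data type)
Data mining
Artificial intelligence
Data set
Drug target
Mathematics
Statistics
Chemistry
Biochemistry
Mathematical analysis
Programming language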
Authors
Rıza Özçelik,Alperen Bağ,Berk Atil,Melih Barsbey,Arzucan Özgür,Elif Özkırımlı
Identifier
DOI:10.1089/cmb.2023.0208
Abstract
Statistical models that accurately predict the binding affinity of an input ligand-protein pair can greatly accelerate drug discovery. Such models are trained on available ligand-protein interaction data sets, which may contain biases that lead the predictor models to learn data set-specific, spurious patterns instead of generalizable relationships. This leads the prediction performances of these models to drop dramatically for previously unseen biomolecules. Various approaches that aim to improve model generalizability either have limited applicability or introduce the risk of degrading overall prediction performance. In this article, we present DebiasedDTA, a novel training framework for drug-target affinity (DTA) prediction models that addresses data set biases to improve the generalizability of such models. DebiasedDTA relies on reweighting the training samples to achieve robust generalization, and is thus applicable to most DTA prediction models. Extensive experiments with different biomolecule representations, model architectures, and data sets demonstrate that DebiasedDTA achieves improved generalizability in predicting drug-target affinities.
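The abstract states only that the framework reweights training samples; it does not say how the weights are chosen. The following is a minimal sketch of sample reweighting in general, not the DebiasedDTA procedure. The function names, the toy affinity values, and the weight vector are all hypothetical and exist only for illustration. It shows how giving one sample more weight shifts the optimum of a weighted squared-error objective, using the simplest possible predictor, a constant.

```python
# Generic sample reweighting sketch (NOT the authors' method).
# All names and numbers below are illustrative assumptions.

def normalize_weights(raw_weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(raw_weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in raw_weights]


def weighted_mse(predictions, targets, weights):
    """Weighted mean squared error; weights are assumed to sum to 1."""
    return sum(w * (p - t) ** 2
               for p, t, w in zip(predictions, targets, weights))


def fit_weighted_mean(targets, weights):
    """Constant predictor that minimizes weighted_mse: the weighted mean."""
    return sum(w * t for t, w in zip(targets, weights))


# Toy affinity labels: three similar samples and one unlike the rest.
affinities = [5.0, 5.0, 5.0, 9.0]

# Uniform weights reproduce ordinary (unweighted) training.
uniform = normalize_weights([1, 1, 1, 1])

# Hypothetical weights that emphasize the fourth sample.
reweighted = normalize_weights([1, 1, 1, 3])

uniform_fit = fit_weighted_mean(affinities, uniform)
reweighted_fit = fit_weighted_mean(affinities, reweighted)

print(uniform_fit, reweighted_fit)
```

Because the weights enter only through the loss, the same idea carries over to any model trained by minimizing a per-sample loss, which is consistent with the abstract's claim that the approach applies to most DTA prediction models.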