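Computer science
Confusion matrix
Reinforcement learning
Artificial intelligence
Function (biology)
Machine learning
Process (computing)
Biology
Evolutionary biology
Operating system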
Authors
Yadong Wang,Yanlin Jia,Yuhang Tian,Jin Xiao
Identifier
DOI:10.1016/j.eswa.2022.117013
Abstract
Customer credit scoring is a dynamic, interactive process. A static reward function for deep reinforcement learning may fail to guide an agent to adapt to changes in the customer credit scoring environment. To solve this problem, we propose the deep Q-network with confusion-matrix-based dynamic reward function (DQN-CMDRF) model. Specifically, the newly constructed dynamic reward function adjusts the reward according to the change in the confusion matrix after each round of deep Q-network training, which guides the agent to adapt quickly to changes in the environment and thereby improves the customer credit scoring performance of the deep Q-network model. First, we formulate customer credit scoring as a finite Markov decision process. Second, to adjust the reward dynamically according to the customer credit scoring environment, we design the dynamic reward function based on the confusion matrix. Finally, we introduce the confusion-matrix-based dynamic reward function into the deep Q-network model for customer credit scoring. To verify the effectiveness of the proposed model, we use four evaluation measures and conduct a series of experiments on five customer credit scoring datasets. The experimental results show that the constructed dynamic reward function improves the customer credit scoring performance of the deep Q-network model more effectively, and that the DQN-CMDRF model performs significantly better than eight traditional classification models. More importantly, we find that the constructed dynamic reward function accelerates convergence and improves the stability of the deep Q-network model.
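The abstract does not give the reward formula, so the following is only a minimal sketch of what a confusion-matrix-based dynamic reward could look like, not the authors' definition. It assumes a binary task (label 1 = bad credit, 0 = good credit), treats the agent's action as the predicted class, and uses a hypothetical scaling rule: the reward magnitude for a class is 1 plus that class's current miss rate, recomputed after each training round. The names `ConfusionMatrix`, `confusion_matrix`, `dynamic_scales`, and `reward` are illustrative.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    tp: int = 0  # bad customers predicted as bad
    fn: int = 0  # bad customers predicted as good
    fp: int = 0  # good customers predicted as bad
    tn: int = 0  # good customers predicted as good


def confusion_matrix(y_true, y_pred):
    """Count the four outcomes for binary labels (1 = bad, 0 = good)."""
    cm = ConfusionMatrix()
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            cm.tp += 1
        elif t == 1 and p == 0:
            cm.fn += 1
        elif t == 0 and p == 1:
            cm.fp += 1
        else:
            cm.tn += 1
    return cm


def dynamic_scales(cm):
    """Hypothetical rule (an assumption, not from the paper): the reward
    magnitude for a class is 1 plus that class's current miss rate, so the
    class the model currently misclassifies most carries the largest reward."""
    pos = cm.tp + cm.fn
    neg = cm.tn + cm.fp
    fnr = cm.fn / pos if pos else 0.0  # miss rate on bad customers
    fpr = cm.fp / neg if neg else 0.0  # miss rate on good customers
    return {1: 1.0 + fnr, 0: 1.0 + fpr}


def reward(action, label, scales):
    """Positive reward for a correct classification, negative otherwise,
    with the magnitude taken from the true class's current scale."""
    scale = scales[label]
    return scale if action == label else -scale


# After one training round, rebuild the confusion matrix and refresh the scales.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1]
scales = dynamic_scales(confusion_matrix(y_true, y_pred))
```

In this toy round the model misses three of four bad customers, so mistakes and successes on that class are weighted more heavily in the next round than those on good customers. A static reward function would keep both scales fixed regardless of the confusion matrix.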