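Feature (linguistics)
Naive Bayes classifier
Pattern recognition (psychology)
Artificial intelligence
Computer science
Similarity (geometry)
Bayes' theorem
Data mining
Mathematics
Support vector machine
Bayesian probability
Linguistics
Image (mathematics)
Philosophy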
Authors
Haonan Tong,Wei Lü,Weiwei Xing,Shihai Wang
Identifier
DOI:10.1016/j.jss.2023.111721
Abstract
Cross-project defect prediction (CPDP) aims to predict defects in target data using prediction models trained on a source dataset. However, owing to the large distribution difference between source and target data, building high-performance CPDP models remains a challenge. We propose a novel high-performance CPDP method named adaptive triple feature-weighted transfer naive Bayes (ARRAY). ARRAY is characterized by feature-weighted similarity, feature-weighted instance weight, and model adaptive adjustment. Experiments are performed on 34 defect datasets. We compare ARRAY with seven state-of-the-art CPDP methods in terms of area under the ROC curve (AUC), F1, and Matthews correlation coefficient (MCC), using statistical testing methods. Experimental results show that: (1) on average, ARRAY improves MCC, AUC, and F1 over the baselines by at least 18.4%, 6.5%, and 4.5%, respectively; (2) ARRAY performs significantly better than each baseline on most datasets; (3) ARRAY significantly outperforms all baselines with non-negligible effect size according to the post-hoc test. It can be concluded that: (1) the proposed feature-weighted similarity, feature-weighted instance weight, and model adaptive adjustment are very helpful for improving the performance of CPDP models; (2) ARRAY is a promising alternative for CPDP with common metrics.
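For readers unfamiliar with two of the evaluation metrics named in the abstract, the following is a minimal sketch of how F1 and MCC can be computed from confusion-matrix counts. It uses the standard textbook definitions and is not taken from the paper; the function names `f1_from_counts` and `mcc_from_counts` and the example counts are hypothetical. AUC is omitted because it requires ranked prediction scores rather than counts.

```python
import math


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc_from_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not results from the paper).
    tp, tn, fp, fn = 40, 45, 5, 10
    print("F1 :", round(f1_from_counts(tp, fp, fn), 4))
    print("MCC:", round(mcc_from_counts(tp, tn, fp, fn), 4))
```

MCC uses all four cells of the confusion matrix, which is one reason it is often reported alongside F1 on class-imbalanced defect datasets.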