MASTER: Multi-Source Transfer Weighted Ensemble Learning for Multiple Sources Cross-Project Defect Prediction

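Keywords: computer science, transfer of learning, ensemble learning, artificial intelligence, machine learning, transmission (computing), data mining, parallel computing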

Authors

Haonan Tong, Dalin Zhang, Jiqiang Liu, Weiwei Xing, Lingyun Lu, Wei Lu, Yumei Wu

Source

Journal: IEEE Transactions on Software Engineering [IEEE Computer Society]
Date: 2024-03-25; Volume (Issue): 50 (5), pages 1281-1305; Citations: 6

Identifier

DOI: 10.1109/tse.2024.3381235

Abstract

Background: Multi-source cross-project defect prediction (MSCPDP) attempts to transfer defect knowledge learned from multiple source projects to the target project. MSCPDP has drawn increasing attention from the academic and industrial communities owing to its advantages over single-source cross-project defect prediction (SSCPDP). However, two main problems seriously restrict the performance of existing MSCPDP models: how to effectively extract the transferable knowledge from each source dataset, and how to measure the amount of knowledge transferred from each source dataset to the target dataset.

Objective: In this paper, we propose a novel multi-source transfer weighted ensemble learning (MASTER) method for MSCPDP.

Method: MASTER measures the weight of each source dataset based on feature importance and distribution difference, and then extracts the transferable knowledge using the proposed feature-weighted transfer learning algorithm. Experiments are performed on 30 software projects. We compare MASTER with the latest state-of-the-art MSCPDP methods, using statistical tests, in terms of four well-known effort-unaware measures (PD, PF, AUC, and MCC) and two widely used effort-aware measures (P_opt 20% and IFA).

Result: The experimental results show that: 1) MASTER can substantially improve prediction performance compared with the baselines, e.g., improvements of at least 49.1% in MCC and 48.1% in IFA; 2) MASTER significantly outperforms each baseline on most datasets in terms of AUC, MCC, P_opt 20%, and IFA; 3) the MSCPDP model performs significantly better than the mean case of the SSCPDP model on most datasets and even outperforms the best case of SSCPDP on some datasets.

Conclusion: We conclude that 1) it is necessary to conduct MSCPDP, and 2) the proposed MASTER is a more promising alternative for MSCPDP.
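
For readers unfamiliar with three of the effort-unaware measures named in the abstract, the following is a minimal sketch based on the standard confusion-matrix definitions: PD is the recall on defective modules, PF is the false-positive rate, and MCC is the Matthews correlation coefficient. The function name `pd_pf_mcc` and the sample labels are illustrative assumptions and are not taken from the paper, which may compute these measures differently in detail.

```python
import math


def pd_pf_mcc(y_true, y_pred):
    """Return (PD, PF, MCC) for binary labels, where 1 = defective, 0 = clean.

    Hypothetical helper for illustration only; not the paper's code.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # defects found
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # defects missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # clean, predicted clean

    # PD (probability of detection) = TP / (TP + FN)
    pd = tp / (tp + fn) if (tp + fn) else 0.0
    # PF (probability of false alarm) = FP / (FP + TN)
    pf = fp / (fp + tn) if (fp + tn) else 0.0
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return pd, pf, mcc


if __name__ == "__main__":
    # Made-up labels for six modules, purely for demonstration.
    actual = [1, 1, 0, 0, 1, 0]
    predicted = [1, 0, 0, 1, 1, 0]
    print(pd_pf_mcc(actual, predicted))
```

Higher PD and MCC indicate better prediction, while a lower PF is better. MCC ranges from -1 to 1 and accounts for all four confusion-matrix cells, which is why it is often preferred on imbalanced defect datasets.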
