Objective: We propose a novel multi-source transfer weighted ensemble learning (MASTER) method for multi-source cross-project defect prediction (MSCPDP).
Method: MASTER weights each source dataset according to feature importance and distribution difference, and then extracts transferable knowledge with the proposed feature-weighted transfer learning algorithm. Experiments are performed on 30 software projects. We compare MASTER with state-of-the-art MSCPDP methods, using statistical tests, in terms of four widely used effort-unaware measures (PD, PF, AUC, and MCC) and two widely used effort-aware measures (Popt 20% and IFA).
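For readers unfamiliar with the effort-unaware measures, the following is a minimal sketch of how PD, PF, and MCC can be computed from binary predictions using their standard confusion-matrix definitions. The function name `effort_unaware_measures` and the toy labels are illustrative assumptions; this is not the evaluation code used in the study.

```python
import math


def effort_unaware_measures(y_true, y_pred):
    """Return PD, PF, and MCC for binary labels (1 = defective, 0 = clean).

    Hypothetical helper for illustration only.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # defects found
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # defects missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # clean, predicted clean

    # PD (probability of detection) is recall on the defective class.
    pd = tp / (tp + fn) if (tp + fn) else 0.0
    # PF (probability of false alarm) is the false positive rate.
    pf = fp / (fp + tn) if (fp + tn) else 0.0
    # MCC uses all four cells; by convention it is set to 0 when undefined.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"PD": pd, "PF": pf, "MCC": mcc}


# Toy example: 3 defective and 5 clean modules.
measures = effort_unaware_measures(
    y_true=[1, 1, 1, 0, 0, 0, 0, 0],
    y_pred=[1, 1, 0, 1, 0, 0, 0, 0],
)
print(measures)
```

Higher PD and MCC and lower PF indicate better prediction; AUC and the effort-aware measures additionally require predicted scores or module sizes and are omitted from this sketch.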
Result: The experimental results show that: 1) MASTER substantially improves prediction performance over the baselines, e.g., by at least 49.1% in MCC and 48.1% in IFA; 2) MASTER significantly outperforms each baseline on most datasets in terms of AUC, MCC, Popt 20%, and IFA; 3) the MSCPDP model performs significantly better than the mean case of the single-source cross-project defect prediction (SSCPDP) model on most datasets, and even outperforms the best case of SSCPDP on some datasets.
Conclusion: We conclude that 1) conducting MSCPDP is necessary, and 2) the proposed MASTER is a more promising alternative for MSCPDP than the compared baselines.