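Computer science
Domain adaptation
Focus (optics)
Source code
Adaptation (eye)
Coding (set theory)
Domain (mathematical analysis)
Artificial intelligence
Machine learning
Set (abstract data type)
Software
Data mining
Class (philosophy)
Software bug
Unsupervised learning
Transfer of learning
Labeled data
Programming language
Mathematical analysis
Physics
Mathematics
Classifier (UML)
Optics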
Authors
Xiaosong Huang, Youfeng Wu, Hongyi Liu, Ying Liu, Honggang Yu, Dewen Guo, Zhonghai Wu
Identifier
DOI:10.1109/saner56733.2023.00037
Abstract
Software defect prediction can automatically locate defective code modules so that testing resources are better focused. Traditional defect prediction methods mainly rely on manually designed features, which are fed into machine learning classifiers to identify defective code. However, prior work has two main problems. First, manually designing features is time-consuming and cannot capture the semantic information of programs, which is important for accurate defect prediction. Second, labeled data is limited and severely class-imbalanced, which degrades defect prediction performance. In response to these problems, we propose a new unsupervised domain adaptation method using pseudo labels for defect prediction (UDA-DP). Compared to manually designed features, it automatically extracts defect-related features from source programs, which saves time and captures more semantic information of programs. Moreover, unsupervised domain adaptation using pseudo labels is a kind of transfer learning, which is effective in leveraging the rich information in limited data and alleviates the problem of insufficient data. Experiments with 10 open source projects from the PROMISE dataset show that our proposed UDA-DP method outperforms the state-of-the-art methods for both within-project and cross-project defect prediction. Our code and data are available at https://github.com/xsarvin/UDA-DP.
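
The abstract does not describe how pseudo labels are produced, so the following is only a generic illustration of the pseudo-labeling idea and should not be read as the UDA-DP implementation. It is a minimal sketch that assumes a nearest-centroid classifier over two-dimensional toy feature vectors; the function names, the confidence threshold, and the data are all hypothetical. A model is fitted on a labeled source project, confident predictions on an unlabeled target project are kept as pseudo labels, and the model is refitted on both.

```python
import math


def centroid(points):
    """Mean vector of a non-empty list of equal-length feature vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def fit_centroids(features, labels):
    """Toy nearest-centroid model: one mean vector per class label."""
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)
    return {y: centroid(xs) for y, xs in by_class.items()}


def predict_with_confidence(centroids, x):
    """Return (label, confidence), with confidence as a softmax over negative distances."""
    weights = {y: math.exp(-math.dist(c, x)) for y, c in centroids.items()}
    label = max(weights, key=weights.get)
    return label, weights[label] / sum(weights.values())


def pseudo_label_adapt(src_x, src_y, tgt_x, threshold=0.8, rounds=5):
    """Self-training loop (illustrative assumption, not the paper's algorithm).

    Fits on labeled source data, pseudo-labels target samples whose
    confidence reaches `threshold`, then refits on source plus pseudo-labeled
    target data until the pseudo labels stop changing or `rounds` is reached.
    """
    centroids = fit_centroids(src_x, src_y)
    pseudo = {}
    for _ in range(rounds):
        new_pseudo = {}
        for i, x in enumerate(tgt_x):
            label, conf = predict_with_confidence(centroids, x)
            if conf >= threshold:
                new_pseudo[i] = label
        if new_pseudo == pseudo:
            break  # pseudo labels are stable
        pseudo = new_pseudo
        train_x = src_x + [tgt_x[i] for i in pseudo]
        train_y = src_y + [pseudo[i] for i in pseudo]
        centroids = fit_centroids(train_x, train_y)
    return centroids, pseudo


# Hypothetical toy data: label 0 = clean module, label 1 = defective module.
src_x = [[0, 0], [0, 1], [1, 0], [4, 4], [4, 5], [5, 4]]
src_y = [0, 0, 0, 1, 1, 1]
# Unlabeled target project with shifted features and one ambiguous sample.
tgt_x = [[1, 1], [1.5, 1], [5.5, 5], [6, 6], [2.4, 2.4]]

centroids, pseudo = pseudo_label_adapt(src_x, src_y, tgt_x)
print(pseudo)
```

In this sketch the threshold keeps the ambiguous target sample out of the training set, which is the usual motivation for confidence filtering: low-confidence pseudo labels would otherwise feed errors back into the model.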