Interpretability application of the Just-in-Time software defect prediction model

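Interpretability; Computer science; Software bug; Predictive modeling; Data mining; Software; Set (abstract data type); Machine learning; Coding (set theory); Granularity; Artificial intelligence; Reliability engineering; Engineering; Operating system; Programming language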

Authors

Wei Zheng, Tianren Shen, Xiang Chen, Peiran Deng

Source

Journal: Journal of Systems and Software [Elsevier]
Date: 2022-02-03 Volume/Issue: 188: 111245-111245 Citations: 90

Identifier

DOI: 10.1016/j.jss.2022.111245

Abstract

Software defect prediction is one of the most active fields in software engineering. Just-in-time defect prediction, proposed more recently, has become a hot topic within it because of its directness and fine granularity: it predicts whether a software defect exists in each code change submitted by a developer. The method also offers high speed and easy tracking. However, its biggest challenge is that prediction accuracy is affected by class imbalance in the data set. In most cases, 20% of defects in software engineering may be in 80% of modules, and code changes that do not cause defects account for a large proportion. The data set is therefore imbalanced between the minority class and the majority class, which degrades the classification performance of the model. Furthermore, because most features do not result in code changes that cause defects, it is not easy to achieve the desired results in practice even when the model is highly predictive. In addition, the data set contains many irrelevant and redundant features, which increase the complexity of the prediction model and reduce prediction efficiency. To improve the prediction efficiency of just-in-time defect prediction, we trained a just-in-time defect prediction model based on random forest classification using six open source projects from different fields, and used the LIME interpretability technique to explain the model to a certain extent.
By using interpretability methods to extract meaningful, relevant features, the experiments show that only 45% of the original work is needed to explain the prediction results of the model and identify critical features, and only 96% of the original work is needed to achieve this goal while maintaining a specified level of prediction performance. Therefore, applying interpretable techniques can significantly reduce the workload of developers and improve work efficiency.
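
The effect of class imbalance described in the abstract can be illustrated with a minimal sketch. It is not the authors' experiment: the 20/80 label split is hypothetical toy data, the always-predict-clean "model" is a stand-in for any classifier biased toward the majority class, and recall is a metric chosen here for illustration rather than one named in the abstract.

```python
# Hypothetical toy data (not from the paper): 1 = defect-inducing change, 0 = clean change.
labels = [1] * 20 + [0] * 80

# A trivial stand-in "model" that always predicts the majority class (clean).
predictions = [0] * len(labels)


def accuracy(y_true, y_pred):
    """Fraction of changes whose label was predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual defect-inducing changes that were flagged."""
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return true_pos / actual_pos if actual_pos else 0.0


acc = accuracy(labels, predictions)
rec = recall(labels, predictions)
print(f"accuracy={acc:.2f} recall={rec:.2f}")  # accuracy=0.80 recall=0.00
```

The accuracy looks acceptable even though no defect-inducing change is detected, which is one way an imbalanced data set can make a model appear highly predictive while being of little practical use.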
