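Computer science
Machine learning
Software development
Software
Software construction
Software metrics
Artificial intelligence
Software evolution
Software engineering
Software bug
Software sizing
Software analysis
Software system
Verification and validation
Software quality
Programming language
Engineering
Operations management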
Authors
Geanderson E. dos Santos, Adriano Veloso, Eduardo Figueiredo
Identifier
DOI:10.1145/3555228.3555269
Abstract
Software defect prediction is a field of study at the intersection of software engineering and machine learning. The literature has proposed numerous machine learning models that predict software defects from software data, such as commits and code metrics. However, these models are more valuable when their predictions can be understood. Otherwise, software developers cannot reason about why a model made a given prediction, which raises questions about the model's applicability in software projects. Because explainable machine learning for defect prediction remains a recent research topic, it leaves room for exploration. In this paper, we present a preliminary analysis of an extensive dataset for predicting software defects. The dataset includes 47,618 classes from 53 open-source projects and covers 66 software features related to numerous aspects of the code. We contribute explanations of how each selected software feature favors the prediction of software defects in Java projects. Our initial results suggest that developers should keep the values of some specific software features small to avoid software defects. We hope our approach can guide further discussion of explainable machine learning for defect prediction and its impact on software development.