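Computer science
Granularity
Tree (set theory)
Tree traversal
Data mining
Software
Encoding (memory)
Artificial intelligence
Machine learning
Benchmark (surveying)
Decision tree
Algorithm
Programming language
Mathematical analysis
Mathematics
Geography
Geodesy
Authors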
Shaojian Qiu,Huihao Huang,Wenchao Jiang,Fanlong Zhang,Weilin Zhou
Source
Journal: IEEE Transactions on Sustainable Computing
[Institute of Electrical and Electronics Engineers]
Date: 2024-01-01
Volume/issue: : 1-12
Citations: 2
Identifier
DOI:10.1109/tsusc.2023.3248965
Abstract
Defects in software may result in system crashes, sluggish performance, or even deadlock, leading to the depletion of valuable resources. Defect prediction can help quality assurance teams identify potential software issues and allocate testing resources rationally, thereby reducing wasted resources and enhancing software sustainability. Researchers have recently incorporated deep learning into defect prediction, extracting structural-semantic features from the abstract syntax trees (ASTs) of code. However, inappropriate node granularity in ASTs may adversely impact the effectiveness of the extracted features. In addition, converting AST nodes into integer vectors may lose structural information, resulting in poor predictive capability. To address these challenges, this paper proposes a tree-based encoding method with hybrid granularity for defect prediction. Specifically, five granularity selection schemes are extended to generate various ASTs from code. Subsequently, a tree-based continuous bag-of-words model is used to map AST nodes into numeric vector representations that conform to the tree-like structure of code. The matrices converted from the ASTs are then fed into a convolutional neural network to extract program features automatically. Experiments involving 24 versions of open-source projects demonstrate that the method can improve the effectiveness of extracted features in defect prediction tasks.
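To make the idea of node granularity and tree-shaped context concrete, here is a minimal sketch. It does not reproduce the paper's pipeline: the abstract does not list the five granularity schemes or name the language of the studied projects, so the sketch uses Python's standard `ast` module as a stand-in, and the `COARSE` node set and the helper names `node_tokens` and `tree_contexts` are hypothetical.

```python
import ast

# Hypothetical "coarse" granularity scheme: keep only definition and
# control-flow nodes. The paper's actual five schemes are not given here.
COARSE = (ast.FunctionDef, ast.ClassDef, ast.If, ast.For, ast.While, ast.Return)


def node_tokens(source, keep=None):
    """Return AST node type names in breadth-first order.

    keep=None keeps every node (fine granularity); otherwise only nodes
    that are instances of the given classes are kept.
    """
    tree = ast.parse(source)
    return [type(n).__name__ for n in ast.walk(tree)
            if keep is None or isinstance(n, keep)]


def tree_contexts(source):
    """Collect (node, parent, children) type-name triples.

    A tree-based CBOW model could predict a node from this structural
    neighbourhood rather than from a linear window of tokens.
    """
    out = []

    def visit(node, parent_name):
        name = type(node).__name__
        children = list(ast.iter_child_nodes(node))
        out.append((name, parent_name, [type(c).__name__ for c in children]))
        for child in children:
            visit(child, name)

    visit(ast.parse(source), None)
    return out


SAMPLE = "def f(x):\n    if x:\n        return 1\n    return 0\n"

if __name__ == "__main__":
    print(node_tokens(SAMPLE, COARSE))
    for triple in tree_contexts(SAMPLE)[:3]:
        print(triple)
```

The same source yields a short token sequence under the coarse scheme and a much longer one when every node is kept, which is the trade-off the abstract refers to as node granularity. Training embeddings on the context triples and feeding the resulting matrices to a CNN are left out of this sketch.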