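Computer science
Source code
Natural language processing
Syntax
Artificial intelligence
Graph
Abstract syntax
Abstract syntax tree
Programming language
Control flow graph
Deep learning
Semantics (computer science)
Theoretical computer science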
Authors
Ahmed Abdu, Zhengjun Zhai, Hakim A. Abdo, Redhwan Algabri
Source
Journal: IEEE Transactions on Reliability
[Institute of Electrical and Electronics Engineers]
Date: 2024-02-01
Volume (issue) and pages: 73 (2): 820-834
Citations: 9
Identifier
DOI:10.1109/tr.2024.3354965
Abstract
Software defect prediction approaches play an essential role in the software development life cycle by helping developers predict defects early, thus preventing wasted time and effort. Defect prediction techniques based on semantic features have recently proven more successful than approaches based on traditional features. Existing semantic-feature-based defect prediction approaches use a single source code representation: most studies focus on contextual syntax represented by abstract syntax trees, and some use a control flow graph to represent code as a graph. However, a single representation is still limited when predicting defects that involve calls to multiple functions, and it has a high probability of false positives. To close this gap between source code representations in software defect prediction, we propose a defect prediction model based on multiple source code representations. The proposed model is a deep hierarchical convolutional neural network (DH-CNN). The syntax features extracted from abstract syntax trees using Word2vec are fed into a syntax-level DH-CNN, and the semantic-graph features extracted from the control flow graph and data dependence graph using Node2vec are fed into a semantic-level DH-CNN. In addition, the proposed model includes a gated merging mechanism that combines the two DH-CNN outputs by estimating the combination ratio of the two types of features. Experimental results indicate that DH-CNN outperforms existing methods in both cross-project and within-project scenarios.
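
The abstract does not specify the form of the gated merging mechanism. As a rough illustration only, the sketch below assumes a common element-wise gating scheme: a sigmoid gate computed from the concatenated syntax-level and semantic-level outputs sets the mixing ratio for each feature dimension. The function name `gated_merge`, the weight layout, and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_merge(h_syntax, h_semantic, weights, bias):
    """Blend two equal-length feature vectors with an element-wise gate.

    Assumed scheme (not confirmed by the source):
        g = sigmoid(W . [h_syntax ; h_semantic] + b)
        merged = g * h_syntax + (1 - g) * h_semantic

    weights: one row per output dimension, each row as long as the
             concatenation of the two input vectors.
    bias:    one value per output dimension.
    """
    if len(h_syntax) != len(h_semantic):
        raise ValueError("both feature vectors must have the same length")
    concat = list(h_syntax) + list(h_semantic)
    merged = []
    for i in range(len(h_syntax)):
        # Gate pre-activation for dimension i.
        z = sum(w * x for w, x in zip(weights[i], concat)) + bias[i]
        g = sigmoid(z)  # share of the syntax-level feature, in (0, 1)
        merged.append(g * h_syntax[i] + (1.0 - g) * h_semantic[i])
    return merged


# Toy vectors standing in for the two DH-CNN branch outputs.
h_syn = [1.0, 3.0]
h_sem = [3.0, 1.0]
zero_w = [[0.0] * 4 for _ in range(2)]

# With zero weights and bias the gate is 0.5, so the result is the average.
averaged = gated_merge(h_syn, h_sem, zero_w, [0.0, 0.0])

# A large positive bias pushes the gate toward 1, favouring syntax features.
syntax_heavy = gated_merge(h_syn, h_sem, zero_w, [50.0, 50.0])
```

In a trained model the gate weights and bias would be learned jointly with the two branches, so the combination ratio could vary per sample and per feature dimension; here they are fixed by hand only to show the two extremes.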