DeKeDVer: A deep learning-based multi-type software vulnerability classification framework using vulnerability description and source code

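Computer science; Source code; Vulnerability (computing); Artificial intelligence; Classifier (UML); Machine learning; Vulnerability assessment; Data mining; Computer security; Programming language; Psychology; Psychological resilience; Psychotherapist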

Authors

Yukun Dong, Yeer Tang, Xiaotong Cheng, Yufei Yang

Source

Journal: Information & Software Technology [Elsevier BV]
Date: 2023-11-01 Volume/Issue: 163: 107290-107290 Cited by: 3

Identifiers

DOI: 10.1016/j.infsof.2023.107290

Abstract

Software vulnerabilities have long troubled software developers. Vulnerability classification is therefore crucial: it identifies the specific type of a vulnerability so that targeted repair can follow. Many papers have examined deep learning-based multi-type vulnerability classification; most rely on vulnerability descriptions and some on source code. However, vulnerability descriptions can sometimes mislead classification, and source code-based approaches have rarely been explored for multi-type vulnerability classification. We design DeKeDVer (Vulnerability Descriptions and Key Domain based Vulnerability Classifier) with two objectives: (i) to extract more useful information from vulnerability descriptions; (ii) to better utilize the information that source code can reflect. In this work, we propose a multi-type vulnerability classifier that combines vulnerability descriptions and source code. We process the vulnerability descriptions and source code of each project separately. For the vulnerability description of a sample, we preprocess it with a procedure we designed based on our observations of numerous descriptions, and then select text features. A Text Recurrent Convolutional Neural Network (TextRCNN) is then applied to learn the text information. For source code, we leverage its Code Property Graph (CPG) and extract key domains from it, which are then embedded. The resulting feature vectors are fed into a Relational Graph Attention Network (RGAT). The output vectors of TextRCNN and RGAT are combined into the feature vector of the current sample. A Multi-Layer Perceptron (MLP) layer is then added to perform classification. We conduct our experiments on C/C++ projects from NVD. Experimental results show that our work achieves a weighted F1-measure of 84.49%, which suggests that our approach is more effective.
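The fusion step described in the abstract (a text vector and a graph vector combined, then classified by an MLP) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the two vectors are combined, so concatenation is assumed here, and the function names, dimensions, and random weights are all hypothetical stand-ins for trained TextRCNN, RGAT, and MLP components.

```python
import math
import random


def fuse_and_classify(text_vec, graph_vec, w_hidden, b_hidden, w_out, b_out):
    """Concatenate two feature vectors and run a one-hidden-layer MLP.

    Returns one probability per class (softmax over the output logits).
    """
    # Assumption: "combined" is modeled as plain concatenation.
    fused = list(text_vec) + list(graph_vec)

    # Hidden layer with ReLU activation.
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, fused)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]

    # Output layer: one logit per vulnerability type.
    logits = [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(w_out, b_out)
    ]

    # Numerically stable softmax.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Build a rows x cols matrix of small random weights (untrained)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
TEXT_DIM, GRAPH_DIM, HIDDEN_DIM, NUM_CLASSES = 4, 3, 5, 3  # hypothetical sizes

text_vec = [rng.random() for _ in range(TEXT_DIM)]    # stand-in for a TextRCNN output
graph_vec = [rng.random() for _ in range(GRAPH_DIM)]  # stand-in for an RGAT output

probs = fuse_and_classify(
    text_vec, graph_vec,
    random_matrix(HIDDEN_DIM, TEXT_DIM + GRAPH_DIM, rng), [0.0] * HIDDEN_DIM,
    random_matrix(NUM_CLASSES, HIDDEN_DIM, rng), [0.0] * NUM_CLASSES,
)
predicted_class = max(range(NUM_CLASSES), key=probs.__getitem__)
print(probs, predicted_class)
```

Because the weights are random, the printed probabilities carry no meaning; the sketch only shows the data flow from two separately learned representations into a single classifier.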
Our work uses information from both vulnerability descriptions and source code to facilitate vulnerability classification, and achieves a higher weighted F1-measure than existing vulnerability classification tools.
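For readers unfamiliar with the reported metric, the weighted F1-measure averages the per-class F1 scores, weighting each class by its share of the true labels. The sketch below computes it in plain Python under that standard definition; the function name and the CWE labels in the example are hypothetical illustration data, not results or classes taken from the paper.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0

    for label, count in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)

        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )

        # Each class contributes in proportion to its support.
        score += (count / total) * f1

    return score


# Hypothetical labels for illustration only.
y_true = ["CWE-119", "CWE-119", "CWE-20", "CWE-416"]
y_pred = ["CWE-119", "CWE-20", "CWE-20", "CWE-416"]
print(weighted_f1(y_true, y_pred))
```

Weighting by support means frequent vulnerability types dominate the score, which matters when the class distribution is imbalanced.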
