Vulnerability Detection by Learning From Syntax-Based Execution Paths of Code

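Keywords: Code snippet; Computer science; Abstract syntax tree; Encoding (set theory); Vulnerability (computing); Source code; Artificial intelligence; Classifier (UML); Syntax; Programming language; Computer security; Set (abstract data type)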
Authors
Junwei Zhang, Zhongxin Liu, Xing Hu, Xin Xia, Shanping Li
Source
Journal: IEEE Transactions on Software Engineering [Institute of Electrical and Electronics Engineers]
Volume (Issue): 49 (8): 4196-4212 Citations: 13
Identifier
DOI: 10.1109/tse.2023.3286586
Abstract

Vulnerability detection is essential to protect software systems. Various approaches based on deep learning have been proposed to learn the patterns of vulnerabilities and identify them. Although these approaches have shown vast potential in this task, they still suffer from the following issues: (1) It is difficult for them to distinguish vulnerability-related information from a large amount of irrelevant information, which hinders their effectiveness in capturing vulnerability features. (2) They are less effective in handling long code because many neural models limit the input length, which hinders their ability to represent long vulnerable code snippets. To mitigate these two issues, in this work, we propose to decompose the syntax-based Control Flow Graph (CFG) of a code snippet into multiple execution paths to detect vulnerabilities. Specifically, given a code snippet, we first build its CFG based on its Abstract Syntax Tree (AST), which we refer to as a syntax-based CFG, and decompose the CFG into multiple paths from its entry node to its exit node. Next, we adopt a pre-trained code model and a convolutional neural network to learn the path representations with intra- and inter-path attention. The feature vectors of the paths are combined as the representation of the code snippet and fed into the classifier to detect the vulnerability. Decomposing the code snippet into multiple paths can filter out some redundant information unrelated to the vulnerability and help the model focus on the vulnerability features. Besides, since the decomposed paths are usually shorter than the code snippet, the information located in the tail of long code is more likely to be processed and learned. To evaluate the effectiveness of our model, we build a dataset with over 231k code snippets, in which there are 24k vulnerabilities. Experimental results demonstrate that the proposed approach outperforms state-of-the-art baselines by at least 22.30%, 42.92%, and 32.58% in terms of Precision, Recall, and F1-Score, respectively. Our further analysis investigates the reasons for the proposed approach's superiority.
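
The path-decomposition step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the adjacency-list CFG encoding, the function name `enumerate_paths`, the `max_paths` cap, and the rule that each edge is traversed at most once per path (so a loop body is entered at most once) are all assumptions made here for illustration. The abstract does not say how loops are handled or how the CFG is derived from the AST, so the example graph is written by hand.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical CFG encoding: each node (a statement string) maps to its successors.
CFG = Dict[str, List[str]]


def enumerate_paths(cfg: CFG, entry: str, exit_node: str,
                    max_paths: int = 1000) -> List[List[str]]:
    """Enumerate entry-to-exit paths by depth-first search.

    Assumption: each edge may be used at most once per path, which lets a
    path go through a loop body once and still reach the exit node.
    """
    paths: List[List[str]] = []

    def dfs(node: str, path: List[str], used: Set[Tuple[str, str]]) -> None:
        if len(paths) >= max_paths:   # guard against path explosion
            return
        if node == exit_node:
            paths.append(list(path))  # store a copy of the finished path
            return
        for succ in cfg.get(node, []):
            edge = (node, succ)
            if edge in used:
                continue
            used.add(edge)
            path.append(succ)
            dfs(succ, path, used)
            path.pop()                # backtrack
            used.remove(edge)

    dfs(entry, [entry], set())
    return paths


# Hand-written CFG for a small copy routine with a guard and a loop.
example_cfg: CFG = {
    "entry": ["if (len > 0)"],
    "if (len > 0)": ["buf = malloc(len)", "return -1"],
    "buf = malloc(len)": ["while (i < len)"],
    "while (i < len)": ["buf[i] = src[i]; i++", "return 0"],
    "buf[i] = src[i]; i++": ["while (i < len)"],
    "return -1": ["exit"],
    "return 0": ["exit"],
}

paths = enumerate_paths(example_cfg, "entry", "exit")
for p in paths:
    print(" -> ".join(p))
```

Each resulting path is a shorter statement sequence than the full snippet, which is the property the abstract relies on: a path can be encoded separately, and statements near the end of a long function still fall within a model's input limit.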
