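Computer science
Program comprehension
Source code
Programming language
Graph
Static program analysis
Correctness
Preprocessor
Parsing
Abstract syntax tree
Abstract syntax
Theoretical computer science
Natural language processing
Artificial intelligence
Semantics (computer science)
Software
Software development
Software system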
Authors
José Miguel Paiva,José Paulo Leal,Álvaro Figueira
Identifier
DOI:10.2298/csis230615004p
Abstract
Static source code analysis techniques are gaining relevance in automated assessment of programming assignments as they can provide less rigorous evaluation and more comprehensive and formative feedback. These techniques focus on source code aspects rather than requiring effective code execution. To this end, syntactic and semantic information encoded in textual data is typically represented internally as graphs, after parsing and other preprocessing stages. Static automated assessment techniques, therefore, draw inferences from intermediate representations to determine the correctness of a solution and derive feedback. Consequently, achieving the most effective semantic graph representation of source code for the specific task is critical, impacting the techniques' accuracy, outcome, and execution time. This paper aims to provide a thorough comparison of the most widespread semantic graph representations for the automated assessment of programming assignments, including usage examples, facets, and costs for each of these representations. A benchmark has been conducted to assess their cost using the Abstract Syntax Tree (AST) as a baseline. The results demonstrate that the Code Property Graph (CPG) is the most feature-rich representation, but also the largest and most space-consuming (about 33% more than AST).
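As a rough illustration of what "representing source code as a graph after parsing" can look like, the sketch below turns a small Python snippet into an AST given as a node list and a parent-to-child edge list. It relies only on Python's standard `ast` module; the function name `ast_to_graph` and the node/edge encoding are assumptions made for this example and are not the tooling or the benchmark setup used in the paper.

```python
import ast


def ast_to_graph(source: str):
    """Parse `source` and return its AST as (nodes, edges).

    nodes: list of AST node type names, indexed by node id.
    edges: list of (parent_id, child_id) pairs.
    Hypothetical helper for illustration only.
    """
    tree = ast.parse(source)
    nodes = []
    edges = []

    def visit(node, parent_id):
        # Assign a fresh id on every visit so the result is always a tree,
        # even if the parser reuses some node objects internally.
        node_id = len(nodes)
        nodes.append(type(node).__name__)
        if parent_id is not None:
            edges.append((parent_id, node_id))
        for child in ast.iter_child_nodes(node):
            visit(child, node_id)

    visit(tree, None)
    return nodes, edges


if __name__ == "__main__":
    submission = "def add(a, b):\n    return a + b\n"
    nodes, edges = ast_to_graph(submission)
    print(len(nodes), "nodes,", len(edges), "edges")
    for parent, child in edges:
        print(f"{nodes[parent]} -> {nodes[child]}")
```

Richer representations such as the CPG add further edge kinds (for example control-flow and data-dependence edges) on top of a tree like this one, which is consistent with the abstract's observation that the CPG is larger than the AST baseline.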