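Computer science
Embedding
Word embedding
Information retrieval
Convolutional neural network
Context (archaeology)
Graph
Representation (politics)
Construct (Python library)
Artificial intelligence
Data mining
Natural language processing
Theoretical computer science
Paleontology
Politics
Political science
Law
Biology
Programming language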
Authors
Cheng Zhou, Bin Li, Xiaobing Sun, Shuang Yu
Identifier
DOI:10.1016/j.jss.2023.111617
Abstract
Software bug analysis based on information retrieval (IR) technology is widely studied and used for bug understanding, localization, and fixing. IR techniques use various textual feature extraction methods to formulate the textual information in a new bug report (i.e., its title and description) as an initial query. However, because new bug reports often contain low-quality content and are poorly represented as queries, the retrieval results are usually unsatisfactory. To alleviate these problems, we propose a novel knowledge-aware bug report reformulation approach (KABR) that leverages multi-level embeddings from the bug data. First, we construct a bug-specific knowledge graph (KG) to manage and reuse prior knowledge extracted from historical bug reports. Then, we extract word embeddings from the original bug data, and entity embeddings and context embeddings from the bug-specific KG, to enhance the initial query. Finally, a new query representation is generated from the multi-level embeddings by Convolutional Neural Networks (CNN) with a self-attention mechanism. We evaluate KABR on the duplicate bug report detection task, and the experimental results show that KABR improves the F1-measure by 6%–11% over state-of-the-art approaches.
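The final step of the abstract (fusing word, entity, and context embeddings through self-attention and a CNN into one query vector) can be illustrated with a minimal pure-Python sketch. This is not the authors' architecture: the per-token concatenation, the identity attention projections, the single convolution layer with max-pooling, the cosine comparison, and every function name below are assumptions made for illustration only, and a real model would use learned weights.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(rows):
    """Scaled dot-product self-attention with identity projections.

    Assumption: queries, keys, and values are the inputs themselves
    (no learned weight matrices), to keep the sketch short.
    """
    d = len(rows[0])
    out = []
    for q in rows:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in rows]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, rows)) for j in range(d)])
    return out


def conv1d_maxpool(rows, filters, width):
    """Slide each filter over windows of `width` tokens, apply ReLU,
    then max-pool over all window positions (one value per filter)."""
    pooled = []
    for f in filters:
        acts = []
        for start in range(len(rows) - width + 1):
            window = [x for row in rows[start:start + width] for x in row]
            acts.append(max(0.0, sum(w * x for w, x in zip(f, window))))
        pooled.append(max(acts))
    return pooled


def encode_query(word_emb, entity_emb, context_emb, filters, width=2):
    """Hypothetical fusion: concatenate the three embedding levels per token,
    apply self-attention, then convolve and pool into a fixed-size vector."""
    fused = [w + e + c for w, e, c in zip(word_emb, entity_emb, context_emb)]
    return conv1d_maxpool(self_attention(fused), filters, width)


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb) if na and nb else 0.0
```

In a duplicate-detection setting, one plausible use is to encode a new report and each historical report with `encode_query` and rank the historical reports by `cosine` similarity. Each filter must have `width * (d_word + d_entity + d_context)` weights, where the `d_*` values are the per-token dimensions of the three embedding levels.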