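Computer science
Artificial intelligence
Natural language processing
Context (archaeology)
Eclipse
Software
Artificial neural network
Classifier (UML)
Machine learning
Programming language
Information retrieval
Astronomy
Biology
Physics
Paleontology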
Authors
Montassar Ben Messaoud,Asma Miladi,Ilyes Jenhani,Mohamed Wiem Mkaouer,Lobna Ghadhab
Source
Journal: IEEE Transactions on Reliability
[Institute of Electrical and Electronics Engineers]
Date: 2023-06-01
Volume (issue): 72 (2): 846-858
Citations: 8
Identifier
DOI: 10.1109/tr.2022.3193645
Abstract
Context: Users and developers use bug tracking systems to report errors that occur during the development and testing of software. Manually identifying duplicate reports is tedious, especially for software that has a large bug repository. In this context, automatic duplicate detection becomes a necessary task that can help prevent the same bug from being fixed repeatedly. Objective: In this article, we propose BERT-MLP, a novel pretrained language model using bidirectional encoder representations from transformers (BERT) for duplicate bug report detection (DBRD), with the aim of improving the detection rate compared to existing works. Method: Our approach considers only unstructured data. These are fed into the BERT model in order to learn the contextual relationships between words. The output is fed into a multilayer perceptron (MLP) classifier, representing our base DBRD. Results: Our approach was evaluated on three projects: Mozilla Firefox, Eclipse Platform, and Thunderbird. It achieved accuracies of 92.11%, 94.08%, and 89.03% for Mozilla, Eclipse, and Thunderbird, respectively. A comparison with a dual-channel convolutional neural network (DC-CNN) model and other pretrained models, including RoBERTa and Sentence-BERT, has been conducted. Results showed that BERT-MLP outperformed the second-best-performing models (DC-CNN and Sentence-BERT) by 12% in accuracy for Eclipse and by 9% for both Mozilla and Thunderbird.
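The Method step above (text encodings passed to an MLP classifier) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how the two reports of a pair are combined, how many layers the MLP has, or which activation it uses, so the concatenation of two vectors, the single ReLU hidden layer, and the sigmoid output are all assumptions. The name `mlp_duplicate_score` is hypothetical, and the input vectors merely stand in for BERT encodings, which are not computed here.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def mlp_duplicate_score(emb_a, emb_b, w_hidden, b_hidden, w_out, b_out):
    """Score a pair of bug reports with a one-hidden-layer MLP.

    emb_a, emb_b: fixed-length vectors standing in for BERT encodings
                  of the two reports (assumption, not computed here).
    w_hidden:     one weight row per hidden unit, each of length
                  len(emb_a) + len(emb_b).
    Returns a score in [0, 1]; higher means "more likely duplicate".
    """
    # Assumption: the pair is represented by simple concatenation.
    x = list(emb_a) + list(emb_b)
    # Hidden layer with ReLU activation (assumed).
    hidden = [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    # Single output unit squashed to a probability-like score.
    logit = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return sigmoid(logit)


# Toy usage with hand-picked weights (illustrative values only).
score = mlp_duplicate_score(
    emb_a=[1.0],
    emb_b=[2.0],
    w_hidden=[[1.0, 1.0]],
    b_hidden=[0.0],
    w_out=[1.0],
    b_out=-3.0,
)
print(score)
```

In practice the weights would be learned from labeled duplicate and non-duplicate pairs, and the input vectors would come from a pretrained encoder rather than being written by hand.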