No More Manual Tests? Evaluating and Improving ChatGPT for Unit Test Generation

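Correctness, Readability, Unit testing, Computer science, Usability, Quality (philosophy), Test (biology), Code coverage, Generator (circuit theory), Test case, Keyword-driven testing, Reliability engineering, Machine learning, Software engineering, Programming language, Software, Human-computer interaction, Engineering, Software development, Physics, Paleontology, Software construction, Philosophy, Power (physics), Regression analysis, Epistemology, Quantum mechanics, Biology
Authors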
Zhiqiang Yuan,Yiling Lou,Mingwei Liu,Shiji Ding,Kaixin Wang,Yixuan Chen,Xin Peng
Source
Journal: Cornell University - arXiv; Citations: 36
Identifier
DOI:10.48550/arxiv.2305.04207
Abstract

Unit testing is essential for detecting bugs in functionally discrete program units. Manually writing high-quality unit tests is time-consuming and laborious. Although traditional techniques can generate tests with reasonable coverage, the tests have low readability and cannot be directly adopted by developers. Recent work has shown the great potential of large language models (LLMs) in unit test generation, as they can generate more human-like and meaningful test code. ChatGPT, the latest LLM incorporating instruction tuning and reinforcement learning, has performed well in various domains. However, it remains unclear how effective ChatGPT is at unit test generation. In this work, we perform the first empirical study to evaluate ChatGPT's capability for unit test generation. Specifically, we conduct a quantitative analysis and a user study to systematically investigate the quality of its generated tests in terms of correctness, sufficiency, readability, and usability. The tests generated by ChatGPT still suffer from correctness issues, including diverse compilation errors and execution failures. Still, the passing tests generated by ChatGPT resemble manually written tests, achieving comparable coverage and readability, and are sometimes even preferred by developers. Our findings indicate that generating unit tests with ChatGPT could be very promising if the correctness of its generated tests could be further improved. Inspired by these findings, we propose ChatTESTER, a novel ChatGPT-based unit test generation approach that leverages ChatGPT itself to improve the quality of its generated tests. ChatTESTER incorporates an initial test generator and an iterative test refiner. Our evaluation demonstrates the effectiveness of ChatTESTER: it generates 34.3% more compilable tests and 18.7% more tests with correct assertions than the default ChatGPT.
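
The abstract describes ChatTESTER only at a high level: an initial test generator followed by an iterative test refiner. As a rough illustration of that generate-then-refine idea, the Python sketch below feeds validation errors back to the generator for a bounded number of rounds. It is not the authors' implementation. Every name in it (`generate_and_refine`, `generate`, `validate`, `max_rounds`), the prompt wording, and the stand-in model are assumptions made for this example.

```python
from typing import Callable, Optional, Tuple

# Hypothetical interfaces; none of these names come from the paper.
Generate = Callable[[str], str]            # prompt -> candidate test code
Validate = Callable[[str], Optional[str]]  # test code -> error message, or None if it passes


def generate_and_refine(focal_method: str,
                        generate: Generate,
                        validate: Validate,
                        max_rounds: int = 3) -> Tuple[str, bool, int]:
    """Generate a test, then feed validation errors back for up to max_rounds repairs.

    Returns (test_code, passed, refinement_rounds_used).
    """
    prompt = "Write a unit test for the following method:\n" + focal_method
    test_code = generate(prompt)
    for round_index in range(max_rounds + 1):
        error = validate(test_code)
        if error is None:
            return test_code, True, round_index
        if round_index == max_rounds:
            break  # repair budget exhausted
        # Assumed repair prompt: show the failing test and the error it produced.
        repair_prompt = (
            "The following test failed.\n"
            "Test:\n" + test_code + "\n"
            "Error:\n" + error + "\n"
            "Return a corrected test."
        )
        test_code = generate(repair_prompt)
    return test_code, False, max_rounds


def make_fake_generate() -> Generate:
    """Stand-in for an LLM call: a wrong assertion first, then a corrected one."""
    replies = iter(["assert add(1, 2) == 4", "assert add(1, 2) == 3"])
    return lambda prompt: next(replies)


def fake_validate(test_code: str) -> Optional[str]:
    """Stand-in for compiling and running a test: execute it against a toy add()."""
    def add(a: int, b: int) -> int:
        return a + b
    try:
        exec(test_code, {"add": add})
    except AssertionError:
        return "AssertionError: assertion failed"
    return None


result = generate_and_refine("def add(a, b): return a + b",
                             make_fake_generate(), fake_validate)
```

In this toy run the first candidate fails its assertion, the error is passed back, and the second candidate passes after one refinement round. A real setup would replace `make_fake_generate` with a model call and `fake_validate` with an actual compile-and-run step for the target language.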
