Retrieval-Based Prompt Selection for Code-Related Few-Shot Learning

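Keywords: Computer science, Assertion, Task (project management), Encoding (set theory), Embedding, Artificial intelligence, Selection (genetic algorithm), Language model, Source code, Natural language processing, Programming language, Machine learning, Management, Set (abstract data type), Economics
Authors
Noor Nashid, Mifta Sintaha, Ali Mesbah
Identifier
DOI:10.1109/icse48619.2023.00205
Abstract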

Large language models trained on massive code corpora can generalize to new tasks without task-specific fine-tuning. In few-shot learning, these models take as input a prompt composed of natural language instructions, a few task demonstrations, and a query, and generate an output. However, the creation of an effective prompt for code-related tasks in few-shot learning has received little attention. We present a technique for prompt creation that automatically retrieves code demonstrations similar to the developer task, based on embedding or frequency analysis. We apply our approach, Cedar, to two programming languages, one statically typed and one dynamically typed, and to two tasks: test assertion generation and program repair. For each task, we compare Cedar with state-of-the-art task-specific and fine-tuned models. The empirical results show that, with only a few relevant code demonstrations, our prompt creation technique is effective in both tasks, with exact-match accuracies of 76% and 52% in test assertion generation and program repair, respectively. For assertion generation, Cedar outperforms existing task-specific and fine-tuned models by 333% and 11%, respectively. For program repair, Cedar yields 189% better accuracy than task-specific models and is competitive with recent fine-tuned models. These findings have practical implications for practitioners, as Cedar could potentially be applied to multilingual and multitask settings with minimal examples and effort, without task-specific or language-specific training.
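
The retrieval step described in the abstract can be illustrated with a minimal sketch. This is not Cedar's implementation: it assumes a simple token-frequency cosine similarity as a stand-in for the "frequency analysis" variant, and the function names (`tokenize`, `select_demonstrations`, `build_prompt`), the prompt layout, and the toy demonstration pool are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(code: str) -> list[str]:
    """Split code into lowercase identifier and number tokens."""
    return re.findall(r"[A-Za-z_]\w*|\d+", code.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_demonstrations(query: str, pool: list[tuple[str, str]], k: int = 2):
    """Return the k (input, output) pairs whose input is most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(pool, key=lambda d: cosine(q, Counter(tokenize(d[0]))), reverse=True)
    return ranked[:k]


def build_prompt(instruction: str, demos: list[tuple[str, str]], query: str) -> str:
    """Assemble instruction, retrieved demonstrations, and the query (hypothetical layout)."""
    parts = [instruction]
    for source, target in demos:
        parts.append(f"### Input\n{source}\n### Output\n{target}")
    parts.append(f"### Input\n{query}\n### Output\n")
    return "\n\n".join(parts)


# Toy demonstration pool for assertion generation (illustrative data only).
pool = [
    ("stack.push(1); int top = stack.peek();", "assertEquals(1, top);"),
    ("list.clear(); boolean empty = list.isEmpty();", "assertTrue(empty);"),
    ("String s = parser.parse(null);", "assertNull(s);"),
]
query = "queue.clear(); boolean empty = queue.isEmpty();"

demos = select_demonstrations(query, pool, k=1)
prompt = build_prompt("Generate the test assertion.", demos, query)
print(prompt)
```

With `k=1`, the demonstration that shares the most tokens with the query is placed in the prompt ahead of the query itself. An embedding-based variant would keep the same structure and replace `cosine` over token counts with similarity between vectors produced by a code embedding model.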