Consistency (knowledge base)
Computer science
Ground truth
Language model
Artificial intelligence
Natural language processing
Machine learning
Authors
Jiaxin Huang, Shixiang Gu, Le Hou, Yuexin Wu, Xuezhi Wang, Hongkun Yu, Jiawei Han
Source
Journal: Cornell University - arXiv
Date: 2022-01-01
Citations: 15
Identifier
DOI: 10.48550/arxiv.2210.11610
Abstract
Large Language Models (LLMs) have achieved excellent performances in various tasks. However, fine-tuning an LLM requires extensive supervision. Humans, on the other hand, may improve their reasoning abilities by self-thinking without external inputs. In this work, we demonstrate that an LLM is also capable of self-improving with only unlabeled datasets. We use a pre-trained LLM to generate "high-confidence" rationale-augmented answers for unlabeled questions using Chain-of-Thought prompting and self-consistency, and fine-tune the LLM using those self-generated solutions as target outputs. We show that our approach improves the general reasoning ability of a 540B-parameter LLM (74.4%→82.1% on GSM8K, 78.2%→83.0% on DROP, 90.0%→94.4% on OpenBookQA, and 63.4%→67.9% on ANLI-A3) and achieves state-of-the-art-level performance, without any ground truth label. We conduct ablation studies and show that fine-tuning on reasoning is critical for self-improvement.
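The core data-selection step described in the abstract, sampling multiple Chain-of-Thought rationales per unlabeled question and keeping only majority-vote ("high-confidence") answers as fine-tuning targets, can be sketched as follows. This is a minimal illustration, not the authors' implementation; the `sample_cot` callable, the sample count, and the confidence threshold are assumptions introduced here for clarity.

```python
from collections import Counter
from typing import Callable, List, Tuple

def select_self_training_examples(
    questions: List[str],
    sample_cot: Callable[[str], Tuple[str, str]],  # hypothetical: returns (rationale, final_answer)
    num_samples: int = 32,          # assumed number of sampled reasoning paths per question
    confidence_threshold: float = 0.7,  # assumed minimum vote share to keep a question
) -> List[Tuple[str, str, str]]:
    """Sketch of self-consistency filtering for self-improvement.

    For each unlabeled question, sample several Chain-of-Thought rationales,
    take the majority-vote answer, and keep the question only if the vote
    share exceeds a confidence threshold. Rationales that reach the majority
    answer become (question, rationale, answer) fine-tuning targets.
    """
    training_examples: List[Tuple[str, str, str]] = []
    for question in questions:
        samples = [sample_cot(question) for _ in range(num_samples)]
        answers = [answer for _, answer in samples]
        majority_answer, count = Counter(answers).most_common(1)[0]
        confidence = count / num_samples
        if confidence < confidence_threshold:
            continue  # discard low-confidence questions: no ground-truth label is available
        for rationale, answer in samples:
            if answer == majority_answer:
                training_examples.append((question, rationale, majority_answer))
    return training_examples
```

The selected triples would then be formatted as target outputs for standard supervised fine-tuning of the same model, which is how the approach improves reasoning without any ground-truth labels.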