
Visual Instruction Tuning

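Computer science; Encoder; Encoding (set theory); Field (mathematics); Language model; Artificial intelligence; Natural language processing; Programming language; Mathematics; Set (abstract data type); Pure mathematics; Operating system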

Authors

Haotian Liu, Chunyuan Li, Qingyang Wu, Yong Jae Lee

Source

Journal: Cornell University - arXiv; Date: 2023-01-01; Citations: 307

Links

arxiv.org, datacite.org, doi.org

Identifiers

DOI: 10.48550/arxiv.2304.08485

Abstract

Instruction tuning large language models (LLMs) using machine-generated instruction-following data has improved zero-shot capabilities on new tasks, but the idea is less explored in the multimodal field. In this paper, we present the first attempt to use language-only GPT-4 to generate multimodal language-image instruction-following data. By instruction tuning on such generated data, we introduce LLaVA: Large Language and Vision Assistant, an end-to-end trained large multimodal model that connects a vision encoder and LLM for general-purpose visual and language understanding. Our early experiments show that LLaVA demonstrates impressive multimodal chat abilities, sometimes exhibiting the behaviors of multimodal GPT-4 on unseen images/instructions, and yields an 85.1% relative score compared with GPT-4 on a synthetic multimodal instruction-following dataset. When fine-tuned on Science QA, the synergy of LLaVA and GPT-4 achieves a new state-of-the-art accuracy of 92.53%. We make GPT-4 generated visual instruction tuning data, our model and code base publicly available.
