DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency via Efficient Data Sampling and Routing

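Computer science; Training (meteorology); Sampling (signal processing); Quality (philosophy); Artificial intelligence; Routing (electronic design automation); Deep learning; Machine learning; Computer network; Geography; Telecommunications; Detector; Epistemology; Philosophy; Meteorology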

Authors

Conglong Li, Zhewei Yao, Xiaoxia Wu, Minjia Zhang, Connor Holmes, Cheng Li, Yuxiong He

Source

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence [Association for the Advancement of Artificial Intelligence (AAAI)]
Date: 2024-03-24; Volume (issue): 38 (16): 18490-18498; Citations: 5

Links

aaai.org, arxiv.org, doi.org

Identifier

DOI: 10.1609/aaai.v38i16.29810

Abstract

Recent advances in deep learning models come at the price of formidable training cost. The increasing model size is one of the root causes, but another less-emphasized fact is that data scale is actually increasing at a similar speed as model scale, and the training cost is proportional to both of them. Compared to the rapidly evolving model architecture, how to efficiently use the training data (especially for the expensive foundation model pretraining) is both less explored and difficult to realize due to the lack of a convenient framework that focuses on data efficiency capabilities. To this end, we present DeepSpeed Data Efficiency, a framework that makes better use of data, increases training efficiency, and improves model quality. Specifically, we propose and combine two data efficiency techniques: efficient data sampling via a general curriculum learning library, and efficient data routing via a novel random layerwise token dropping technique. For GPT-3 1.3B language model pretraining, our work achieves 12.5x less data/time/cost ($3.7K if rented on Azure), while still maintaining 95% of model quality compared to the baseline with full data and cost ($46.3K). For GPT-3 1.3B and BERT-large pretraining, our work can also achieve the same model quality with up to 2x less data/time/cost, or achieve better model quality under the same data/time/cost. DeepSpeed Data Efficiency is easy to use and tune, enabling us to easily apply it and verify its benefit on additional tasks including GPT-3 MoE model pretraining and small-scale GPT-2/ViT finetuning.
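The cost figures quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes that cost scales linearly with the stated 12.5x reduction factor (the abstract says training cost is proportional to data and model scale), and the names `BASELINE_COST_USD`, `REDUCTION_FACTOR`, and `reduced_cost` are hypothetical, not part of the paper or of DeepSpeed.

```python
# Figures quoted in the abstract (GPT-3 1.3B pretraining, Azure rental cost).
BASELINE_COST_USD = 46_300   # baseline with full data and cost ($46.3K)
REDUCTION_FACTOR = 12.5      # "12.5x less data/time/cost"


def reduced_cost(baseline_usd: float, factor: float) -> float:
    """Cost after dividing the baseline by the reduction factor.

    Assumption: cost is linear in the amount of data/time consumed.
    """
    return baseline_usd / factor


cost = reduced_cost(BASELINE_COST_USD, REDUCTION_FACTOR)
saving_fraction = 1 - 1 / REDUCTION_FACTOR

print(f"reduced cost: ${cost:,.0f}")      # reduced cost: $3,704
print(f"saving: {saving_fraction:.0%}")   # saving: 92%
```

Under that linearity assumption, $46.3K divided by 12.5 comes to about $3.7K, which matches the figure in the abstract and corresponds to a saving of roughly 92% relative to the baseline.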




