FTRANS

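Computer science, Field-programmable gate array, Transformer, Computation, Parallel computing, Recurrent neural network, Efficient energy use, Block (permutation group theory), Computer engineering, Artificial intelligence, Computer architecture, Artificial neural network, Algorithm, Embedded system, Voltage, Quantum mechanics, Electrical engineering, Physics, Engineering, Mathematics, Geometry
Authors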
Bingbing Li,Santosh Pandey,Haowen Fang,Yanjun Lyv,Ji Li,Jieyang Chen,Mimi Xie,Lipeng Wan,Hang Liu,Caiwen Ding
Identifier
DOI:10.1145/3370748.3406567
Abstract

In natural language processing (NLP), the "Transformer" architecture was proposed as the first transduction model relying entirely on self-attention mechanisms, without sequence-aligned recurrent neural networks (RNNs) or convolution, and it achieved significant improvements on sequence-to-sequence tasks. The intensive computation and storage that these pre-trained language representations require has impeded their adoption on computation- and memory-constrained devices. The field-programmable gate array (FPGA) is widely used to accelerate deep learning algorithms because of its high parallelism and low latency. However, the trained models are still too large to fit on an FPGA fabric. In this paper, we propose an efficient acceleration framework, Ftrans, for transformer-based large-scale language representations. Our framework includes an enhanced block-circulant matrix (BCM)-based weight representation, which enables model compression of large-scale language representations at the algorithm level with little accuracy degradation, and an acceleration design at the architecture level. Experimental results show that our proposed framework significantly reduces the model size of NLP models, by up to 16 times. Our FPGA design achieves 27.07× and 81× improvements in performance and energy efficiency, respectively, compared to CPU, and up to 8.80× improvement in energy efficiency compared to GPU.
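The block-circulant weight representation mentioned in the abstract can be illustrated with a minimal sketch. It assumes the plain BCM scheme, in which each b×b block of a weight matrix is a circulant matrix defined by a single length-b vector, so only b of its b² entries are stored. The "enhanced" variant proposed in the paper is not described in the abstract and is not reproduced here. All function names are hypothetical, and the block size in the usage note is chosen for illustration only. Accelerators commonly evaluate each circulant product with FFTs; the direct sum below is used only to keep the example short.

```python
# Hypothetical sketch of a plain block-circulant matrix (BCM) product.
# Names and sizes are illustrative assumptions, not the Ftrans implementation.

def circulant_matvec(c, x):
    """Multiply the circulant matrix whose first column is c by the vector x."""
    b = len(c)
    # Entry (i, j) of a circulant matrix equals c[(i - j) mod b].
    return [sum(c[(i - j) % b] * x[j] for j in range(b)) for i in range(b)]

def bcm_matvec(blocks, x, b):
    """Compute W @ x, where blocks[p][q] is the defining vector of block (p, q)."""
    y = []
    for block_row in blocks:
        acc = [0] * b
        for q, c in enumerate(block_row):
            # Each block only sees its own length-b slice of the input.
            part = circulant_matvec(c, x[q * b:(q + 1) * b])
            acc = [a + v for a, v in zip(acc, part)]
        y.extend(acc)
    return y

def expand_dense(blocks, b):
    """Expand the compressed form into the full dense matrix (for checking only)."""
    rows = []
    for block_row in blocks:
        for i in range(b):
            rows.append([c[(i - j) % b] for c in block_row for j in range(b)])
    return rows

def compression_ratio(blocks, b):
    """Dense parameter count divided by stored parameter count."""
    n_blocks = len(blocks) * len(blocks[0])
    return (n_blocks * b * b) // (n_blocks * b)
```

As a usage example, an 8×8 weight matrix split into four 4×4 circulant blocks stores 16 values instead of 64, so compression_ratio returns 4. In general the storage reduction of this plain scheme equals the block size b.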