CIMFormer: A Systolic CIM-Array-Based Transformer Accelerator With Token-Pruning-Aware Attention Reformulating and Principal Possibility Gathering

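Security token, Transformer, Principal (computer security), Computer science, Systolic array, Embedded system, Engineering, Computer security, Electrical engineering, Voltage, Very-large-scale integration
Authors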
Ruiqi Guo,X.L. Chen,Lei Wang,Yang Wang,Hao Sun,Jingchuan Wei,Huiming Han,Leibo Liu,Shaojun Wei,Yang Hu,Shouyi Yin
Source
Journal: IEEE Journal of Solid-State Circuits [Institute of Electrical and Electronics Engineers]
Volume and issue: 1-13
Identifier
DOI:10.1109/jssc.2024.3402174
Abstract

Transformer models have achieved impressive performance in various artificial intelligence (AI) applications. However, their high computation cost and memory footprint make inference inefficient. Although digital compute-in-memory (CIM) is a promising hardware architecture with high accuracy, the Transformer's attention mechanism raises three challenges in the access and computation of CIM: 1) the attention computation involving Query and Key results in massive data movement and under-utilization of CIM macros; 2) the attention computation involving Possibility and Value exhibits plenty of dynamic bit-level sparsity, resulting in redundant bit-serial CIM operations; and 3) the restricted data reload bandwidth of CIM macros significantly degrades performance for large Transformer models. To address these challenges, we design a CIM accelerator called CIM Transformer (CIMFormer) with three corresponding features. First, token-pruning-aware attention reformulation (TPAR) adjusts attention computations according to the token-pruning ratio. This reformulation reduces both the real-time access to CIM macros and their under-utilization. Second, the principal possibility gather-scatter scheduler (PPGSS) gathers the possibilities with greater effective bit-width as concurrent inputs to CIM macros, enhancing the efficiency of bit-serial CIM operations. Third, the systolic X|W-CIM macro array efficiently handles the execution of large Transformer models that exceed the storage capacity of the on-chip CIM macros. Fabricated in a 28-nm technology, CIMFormer achieves a peak energy efficiency of 15.71 TOPS/W, an improvement of over 1.46× compared with the state-of-the-art Transformer accelerator under equivalent conditions.
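
The second challenge, redundant bit-serial operations caused by bit-level sparsity, can be illustrated with a small Python sketch. This is not the paper's PPGSS. It is a toy model under stated assumptions: the quantized values in `probs` are made up, `lanes` is a hypothetical number of concurrent CIM inputs, and each group of concurrent inputs is assumed to take as many bit-serial cycles as its widest value needs. Under those assumptions, grouping values of similar effective bit-width lowers the total cycle count.

```python
def effective_bit_width(value: int) -> int:
    """Bits needed to represent a non-negative integer (0 needs 0 bits)."""
    return value.bit_length()


def chunk(values, size):
    """Split a list into consecutive groups of at most `size` items."""
    return [values[i:i + size] for i in range(0, len(values), size)]


def bit_serial_cycles(groups):
    """Toy cost model (an assumption): each group runs for as many
    cycles as the effective bit-width of its widest member."""
    return sum(max((effective_bit_width(v) for v in g), default=0)
               for g in groups)


# Hypothetical 8-bit quantized attention values; most are small,
# which is the bit-level sparsity the abstract refers to.
probs = [200, 1, 0, 3, 2, 150, 0, 1, 90, 0, 2, 1, 5, 0, 1, 64]
lanes = 4  # hypothetical number of concurrent inputs

# Original order: one wide value per group forces long runs everywhere.
baseline = bit_serial_cycles(chunk(probs, lanes))

# Gather wide values together so narrow groups finish early.
gathered = bit_serial_cycles(chunk(sorted(probs, reverse=True), lanes))

print(baseline, gathered)
```

In a real scheduler the partial results would also have to be scattered back to their original positions, which this sketch omits.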