Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology : Industrial Product

Keywords: Computer science, Embedded systems, DRAM, Computer hardware, Computer architecture
Authors
Sukhan Lee, Shin-haeng Kang, Jaehoon Lee, Hyeonsu Kim, Eojin Lee, Seung-Woo Seo, Hosang Yoon, Seungwon Lee, Kyoung-Hwan Lim, Hyunsung Shin, Jin-Hyun Kim, O Seongil, Anand Iyer, David Wang, Kyomin Sohn, Nam Sung Kim
Source
Venue: International Symposium on Computer Architecture; Cited by: 114
Identifier
DOI:10.1109/isca52012.2021.00013
Abstract

Emerging applications such as deep neural networks demand high off-chip memory bandwidth. However, under stringent physical constraints of chip packages and system boards, it becomes very expensive to further increase the bandwidth of off-chip memory. Besides, transferring data across the memory hierarchy constitutes a large fraction of the total energy consumption of systems, and the fraction has steadily increased with the stagnant technology scaling and poor data reuse characteristics of such emerging applications. To cost-effectively increase the bandwidth and energy efficiency, researchers began to reconsider the past processing-in-memory (PIM) architectures and advance them further, especially exploiting recent integration technologies such as 2.5D/3D stacking. Despite the recent advances, no major memory manufacturer has yet developed even proof-of-concept silicon, not to mention a product. This is because the past PIM architectures often require changes in host processors and/or application code, which memory manufacturers cannot easily govern. In this paper, elegantly tackling the aforementioned challenges, we propose an innovative yet practical PIM architecture. To demonstrate its practicality and effectiveness at the system level, we implement it with a 20nm DRAM technology, integrate it with an unmodified commercial processor, develop the necessary software stack, and run existing applications without changing their source code. Our evaluation at the system level shows that our PIM improves the performance of memory-bound neural network kernels and applications by 11.2× and 3.5×, respectively. Atop the performance improvement, PIM also reduces the energy per bit transfer by 3.5×, and improves the overall energy efficiency of the system running the applications by 3.2×.