德拉姆
卷积神经网络
数据流
高效能源利用
计算机科学
计算
并行计算
计算机体系结构
吞吐量
计算机硬件
炸薯条
嵌入式系统
人工智能
能源消耗
算法
工程类
电信
电气工程
无线
作者
Yu-Hsin Chen,Tushar Krishna,Joel Emer,Vivienne Sze
出处
期刊:IEEE Journal of Solid-state Circuits
[Institute of Electrical and Electronics Engineers]
日期:2016-11-08
卷期号:52 (1): 127-138
被引量:2608
标识
DOI:10.1109/jssc.2016.2616357
摘要
Eyeriss is an accelerator for state-of-the-art deep convolutional neural networks (CNNs). It optimizes for the energy efficiency of the entire system, including the accelerator chip and off-chip DRAM, for various CNN shapes by reconfiguring the architecture. CNNs are widely used in modern AI systems but also bring challenges on throughput and energy efficiency to the underlying hardware. This is because its computation requires a large amount of data, creating significant data movement from on-chip and off-chip that is more energy-consuming than computation. Minimizing data movement energy cost for any CNN shape, therefore, is the key to high throughput and energy efficiency. Eyeriss achieves these goals by using a proposed processing dataflow, called row stationary (RS), on a spatial architecture with 168 processing elements. RS dataflow reconfigures the computation mapping of a given shape, which optimizes energy efficiency by maximally reusing data locally to reduce expensive data movement, such as DRAM accesses. Compression and data gating are also applied to further improve energy efficiency. Eyeriss processes the convolutional layers at 35 frames/s and 0.0029 DRAM access/multiply and accumulation (MAC) for AlexNet at 278 mW (batch size N = 4), and 0.7 frames/s and 0.0035 DRAM access/MAC for VGG-16 at 236 mW (N = 3).
科研通智能强力驱动
Strongly Powered by AbleSci AI