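Computer science
Kernel (algebra)
Convolution (computer science)
Feature (linguistics)
Function (biology)
Matrix (chemical analysis)
Computer hardware
Artificial intelligence
Algorithm
Artificial neural network
Discrete mathematics
Mathematics
Evolutionary biology
Biology
Philosophy
Linguistics
Composite material
Materials science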
Authors
Tao Zhongyu,Yuanfeng Wang,Zhang Huai-sheng
Identifier
DOI:10.1109/iccwamtip56608.2022.10016515
Abstract
In convolutional neural networks (CNNs), the convolution of a feature map with a weight map is usually implemented with the im2col + GEMM method. The conventional method, however, must first expand the feature map into a large feature matrix in one kernel function, according to the convolution parameters (i.e., filter size, padding, and stride); the matrix multiplication then takes place in a separate function. The conventional method therefore generates a large amount of data transfer, and the large feature matrix requires enormous storage space, which makes it hardware-unfriendly. We design I2CU (Im2Col Unit), a dedicated hardware unit that implements im2col in a hardware-friendly way. I2CU dynamically expands the loaded 4D block returned from the texture unit and writes the destination matrix back to shared memory. I2CU can reduce the storage space required for the feature matrix and implement im2col + GEMM in one kernel function.
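For readers unfamiliar with the conventional im2col + GEMM pipeline that the abstract describes, the following is a minimal pure-Python sketch of it in software. It is not the authors' I2CU design and says nothing about the hardware unit itself; the function names (`im2col`, `gemm`, `conv2d_im2col`) and the nested-list data layout (channels x height x width) are assumptions chosen for illustration.

```python
def im2col(fmap, kh, kw, stride=1, pad=0):
    """Expand a C x H x W feature map (nested lists) into a
    (C*kh*kw) x (out_h*out_w) feature matrix, zero-padding the borders."""
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    out_h = (h + 2 * pad - kh) // stride + 1
    out_w = (w + 2 * pad - kw) // stride + 1
    cols = []
    for ch in range(c):
        for i in range(kh):
            for j in range(kw):
                row = []
                for oy in range(out_h):
                    for ox in range(out_w):
                        y = oy * stride + i - pad
                        x = ox * stride + j - pad
                        inside = 0 <= y < h and 0 <= x < w
                        row.append(fmap[ch][y][x] if inside else 0)
                cols.append(row)
    return cols, out_h, out_w


def gemm(a, b):
    """Plain matrix multiplication of two nested-list matrices."""
    b_cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols]
            for row in a]


def conv2d_im2col(fmap, weights, stride=1, pad=0):
    """Convolve with weights of shape F x C x kh x kw via im2col + GEMM.
    Returns an F x out_h x out_w output as nested lists."""
    kh, kw = len(weights[0][0]), len(weights[0][0][0])
    cols, out_h, out_w = im2col(fmap, kh, kw, stride, pad)
    # Flatten each filter into one row, in the same (channel, i, j) order
    # that im2col uses for the rows of the feature matrix.
    w_mat = [[v for ch in f for r in ch for v in r] for f in weights]
    flat = gemm(w_mat, cols)
    return [[row[oy * out_w:(oy + 1) * out_w] for oy in range(out_h)]
            for row in flat]
```

In this sketch the intermediate feature matrix holds C*kh*kw*out_h*out_w values, so for stride 1 it is roughly kh*kw times larger than the input feature map. That materialized matrix, and the separate pass that consumes it, is the storage and data-transfer cost the abstract refers to.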