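Computer science
Application-specific integrated circuit
Field-programmable gate array
Computer architecture
Convolutional neural network
Energy consumption
Deep learning
Embedded system
Artificial intelligence
Computer engineering
Computer hardware
Ecology
Biology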
Authors
Yunxiang Hu,Yuhao Liu,Zhuoyuan Liu
Identifier
DOI:10.1109/iccrd54409.2022.9730377
Abstract
In recent years, artificial intelligence (AI) has developed rapidly and been applied in many areas. Among the vast number of neural network (NN) models, the convolutional neural network (CNN) holds a mainstream position in applications such as image and sound recognition and machine decision-making. The convolution operation is the most complex operation and requires acceleration. A practical approach is to optimize the architecture of the deep learning processor (DLP). The traditional CPU architecture lacks parallelism and memory bandwidth and is not suitable for CNN operations. Current research focuses on the graphics processing unit (GPU), the field-programmable gate array (FPGA), and the application-specific integrated circuit (ASIC). The GPU is the most mature and the most widely applied, but it is inflexible and has high cost and energy consumption. The FPGA offers high flexibility and low energy consumption, but it is inferior in performance. The ASIC, owing to its targeted design, excels in performance and energy consumption, but it is highly inflexible. This article reviews the research outcomes of these three classic types of processors as applied to CNNs and puts forward future research trends. In particular, it analyzes and compares the experimental performance of several processors of different types and then summarizes their respective advantageous application fields. The novelty of this article therefore lies in its summary of practical DLPs, which is expected to help AI researchers and to guide the selection of CNN-supporting hardware in industrial applications.