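Computer science
Scalability
Ranging
Embedded system
Database-centric architecture
Architecture
Compiler
Generality
Memory hierarchy
Distributed computing
Reference architecture
Computer architecture
Software architecture
Operating system
Software
Cache
Psychology
Art
Visual arts
Telecommunications
Psychotherapist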
Authors
Heng Liao,Jiajin Tu,Jing Xia,Liu Hu,Zhou Xi-ping,Hao Yuan,Yuxing Hu
Source
Journal: High-Performance Computer Architecture
Date: 2021-02-01
Citations: 28
Identifier
DOI:10.1109/hpca51647.2021.00071
Abstract
Deep neural networks (DNNs) have been successfully applied to a wide variety of applications, ranging from small IoT devices to large-scale services in a data center. To process these DNN models efficiently, dedicated hardware accelerators are required in all of these scenarios. In theory, an optimized acceleration architecture exists for each application. However, given the cost of chip design and of developing the corresponding toolchain, researchers must trade off efficiency against generality. In this work, we demonstrate that it is practical to use a unified architecture, called Ascend, to support these applications, from IoT devices to data-center services. We provide extensive design details to show that the success of Ascend relies on contributions at several levels. First, heterogeneous computing units are employed to support various DNN models, and the datapath is adapted to the requirements of computation and data access. Second, scaling the Ascend architecture from a single core to a cluster containing thousands of cores involves design effort in areas such as the memory hierarchy and system-level integration. Third, a multi-tier compiler, which offers developers flexible choices, is the last critical piece. Experimental results show that accelerators based on the Ascend architecture achieve comparable or even better performance across different applications. In addition, various chips based on the Ascend architecture have been successfully commercialized, and more than 100 million chips have been used in real products.