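Computer science
Neuromorphic engineering
Robot
Distributed computing
Context switch
Reinforcement learning
Robotics
Design space exploration
Asynchronous communication
Computer architecture
Artificial intelligence
Embedded system
Artificial neural network
Computer network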
Authors
Songchen Ma, Jing Pei, Weihao Zhang, Guanrui Wang, Dahu Feng, Fangwen Yu, Chenhang Song, Huanyu Qu, Cheng Ma, Mingsheng Lu, Faqiang Liu, Wenhao Zhou, Yujie Wu, Yihan Lin, Hongyi Li, Taoyi Wang, Jiuru Song, Xue Liu, Guoqi Li, Rong Zhao, Luping Shi
Source
Journal: Science Robotics
[American Association for the Advancement of Science (AAAS)]
Date: 2022-06-15
Volume (issue): 7 (67)
Citations: 33
Identifier
DOI: 10.1126/scirobotics.abk2948
Abstract
Recent advances in artificial intelligence have enhanced the abilities of mobile robots in dealing with complex and dynamic scenarios. However, to enable computationally intensive algorithms to be executed locally in multitask robots with low latency and high efficiency, innovations in computing hardware are required. Here, we report TianjicX, neuromorphic computing hardware that can support true concurrent execution of multiple cross-computing-paradigm neural network (NN) models with various coordination manners for robotics. With spatiotemporal elasticity, TianjicX can support adaptive allocation of computing resources and scheduling of execution time for each task. Key to this approach is a high-level model, “Rivulet,” which bridges the gap between robotic-level requirements and hardware implementations. It abstracts the execution of NN tasks through distribution of static data and streaming of dynamic data to form the basic activity context, adopts time and space slices to achieve elastic resource allocation for each activity, and performs configurable hybrid synchronous-asynchronous grouping. Rivulet is thereby capable of supporting both independent and interactive execution. Building on Rivulet with hardware design for realizing spatiotemporal elasticity, an event-driven 28-nanometer TianjicX neuromorphic chip with high parallelism, low latency, and low power was developed. Using a single TianjicX chip and a specially developed compiler stack, we built a multi-intelligent-tasking mobile robot, Tianjicat, to perform a cat-and-mouse game. Multiple tasks, including sound recognition and tracking, object recognition, obstacle avoidance, and decision-making, can be executed concurrently. Compared with the NVIDIA Jetson TX2, latency is reduced by a factor of 79.09, and dynamic power is reduced by 50.66%.
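The closing comparison in the abstract mixes two kinds of figures: a multiplicative factor for latency and a percentage for dynamic power. The short sketch below puts both on the same scale, as the fraction of the Jetson TX2 baseline that remains. It assumes that "79.09 times" denotes a 79.09-fold reduction (new latency = baseline / 79.09); the abstract does not state absolute latency or power values, so none appear here, and the function names are hypothetical helpers introduced only for illustration.

```python
def remaining_fraction_from_factor(factor: float) -> float:
    """Fraction of the baseline left after an N-fold reduction (assumed reading)."""
    return 1.0 / factor


def remaining_fraction_from_percent(percent: float) -> float:
    """Fraction of the baseline left after a reduction of `percent` percent."""
    return 1.0 - percent / 100.0


# Figures quoted in the abstract, relative to the NVIDIA Jetson TX2 baseline.
latency_fraction = remaining_fraction_from_factor(79.09)
power_fraction = remaining_fraction_from_percent(50.66)

print(f"latency: {latency_fraction:.4f} of baseline")       # 0.0126
print(f"dynamic power: {power_fraction:.4f} of baseline")   # 0.4934
```

Under that reading, latency falls to about 1.26% of the baseline, while dynamic power falls to about 49.34% of it, roughly a halving.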