Computer science
Image processing
Computer hardware
Von Neumann architecture
Embedded systems
Artificial intelligence
Image (mathematics)
Operating systems
Authors
Han Xu,Zheyu Liu,Ziwei Li,Erxiang Ren,Maimaiti Nazhamati,Fei Qiao,Li Luo,Qi Wei,Xin-Jun Liu,Huazhong Yang
Identifier
DOI: 10.1109/a-sscc53895.2021.9634759
Abstract
In the AIoT era, intelligent vision perception systems are widely deployed at the edge. As shown in Fig. 1, due to limited energy budgets, terminal devices usually adopt a hierarchical processing architecture: a coarse object-detection algorithm runs in always-on mode, ready to trigger subsequent complex algorithms for precise recognition or segmentation. In conventional digital vision processing frameworks, light-induced photocurrents must be transformed to voltages ($I_{\mathrm{ph}}$-to-V), converted to digital signals (A-to-D), transferred on-board to processors, and exchanged between memory and processing elements. Smart vision chips offer promising ways to cut these power overheads, such as placing analog processing circuits near the pixel array [2], customizing an analog-to-digital converter (ADC) capable of convolution [3], or embedding processing circuits deep inside the pixels to perform in-sensor current-domain multiply-accumulate (MAC) operations [4]. However, the photocurrent-conversion ($I_{\mathrm{ph}}$-to-V) circuits are still retained in those works; moreover, they can only complete the first-layer convolution for low-level feature extraction and cannot process subsequent layers for end-to-end perception tasks, which limits them to small CNN models. Systems that implement whole CNN algorithms have also been proposed, either by integrating a CIS with an analog processor on one chip [5] or by stacking a CIS chip on a digital processor chip [6]. But the power overheads of data transmission and memory access remain unsolved, because these designs separate sensing and computing and adopt a conventional Von Neumann architecture with heavy memory access.
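To make the in-sensor current-domain MAC idea concrete, the following is a minimal behavioral sketch (not the authors' circuit): each pixel's photocurrent is scaled by a convolution weight, and the scaled currents sum on a shared node, so one first-layer convolution output costs one analog MAC with no $I_{\mathrm{ph}}$-to-V or A-to-D step per pixel. All function names and the pure-Python 3x3 convolution are illustrative assumptions.

```python
# Behavioral model (assumption, not the paper's circuit): an in-sensor
# current-domain MAC sums weighted photocurrents, analogous to currents
# combining on a shared wire by Kirchhoff's current law.
def current_domain_mac(photocurrents, weights):
    """One analog MAC: sum of I_ph * w over a pixel patch."""
    return sum(i * w for i, w in zip(photocurrents, weights))

def conv2d_first_layer(image, kernel):
    """First-layer valid convolution; each output pixel is one MAC."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    out = []
    for r in range(h - k + 1):
        row = []
        for c in range(w - k + 1):
            patch = [image[r + dr][c + dc]
                     for dr in range(k) for dc in range(k)]
            flat_kernel = [kernel[dr][dc]
                           for dr in range(k) for dc in range(k)]
            row.append(current_domain_mac(patch, flat_kernel))
        out.append(row)
    return out
```

The sketch also shows the limitation the abstract notes: only this first layer maps onto the pixel array, so subsequent CNN layers still need a separate processor.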