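Throughput
Field-programmable gate array
Convolution (computer science)
Computer science
Parallel computing
Very-large-scale integration
Finite impulse response
Computational science
Computer hardware
Computer architecture
Algorithm
Electronic engineering
Embedded system
Artificial intelligence
Engineering
Telecommunications
Artificial neural network
Wireless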
Authors
Lenos Ioannou,Abdullah Al-Dujaili,Suhaib A. Fahmy
Identifier
DOI:10.1109/tvlsi.2020.2987202
Abstract
Digital signal processing (DSP) on field-programmable gate arrays (FPGAs) has long been appealing because the inherent parallelism in these computations can be easily exploited to accelerate such algorithms. FPGAs have evolved significantly to further enhance the mapping of these algorithms, including the addition of hard blocks such as the DSP blocks found in modern FPGAs. Although these DSP blocks can offer more efficient mapping of DSP computations, they are primarily designed for 1-D filter structures. We present a study on spatial convolutional filter implementations on FPGAs, optimizing around the structure of the DSP blocks to offer high throughput while maintaining the coefficient flexibility that other published architectures usually sacrifice. We show that it is possible to implement large filters for 4K-resolution image frames at frame rates of 30-60 FPS, while maintaining functional flexibility.
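To give a sense of scale for the throughput claim, the following is a minimal software sketch, not the authors' hardware architecture. It assumes "4K" means 3840x2160 (UHD) and uses a hypothetical 7x7 kernel; the abstract states neither. The function names `pixel_rate`, `macs_per_second`, and `spatial_filter` are illustrative. The filter takes its coefficients as a runtime argument, which mirrors the coefficient flexibility the abstract describes, and it is written in correlation form (no kernel flip), which coincides with convolution for symmetric kernels.

```python
def pixel_rate(width, height, fps):
    """Pixels that must be processed per second to sustain the frame rate."""
    return width * height * fps


def macs_per_second(width, height, fps, k):
    """Multiply-accumulates per second for a k x k filter, one output per pixel."""
    return pixel_rate(width, height, fps) * k * k


def spatial_filter(image, kernel):
    """Direct 2-D sliding-window filter over the 'valid' region.

    image and kernel are lists of rows. Coefficients are supplied at call
    time rather than baked in (correlation form: the kernel is not flipped).
    """
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += kernel[i][j] * image[r + i][c + j]
            row.append(acc)
        out.append(row)
    return out


if __name__ == "__main__":
    # Assumed UHD frame size; the abstract only says "4K".
    for fps in (30, 60):
        print(fps, "FPS:", pixel_rate(3840, 2160, fps), "pixels/s")
    # Hypothetical 7x7 kernel size, chosen only for illustration.
    print("7x7 at 60 FPS:", macs_per_second(3840, 2160, 60, 7), "MAC/s")
```

Under these assumptions the required rate is roughly 249 to 498 million pixels per second, and a 7x7 kernel at 60 FPS needs about 24.4 billion multiply-accumulates per second, which is why mapping the arithmetic efficiently onto many parallel DSP blocks matters.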