Computer science
Field-programmable gate array
Loop unrolling
Convolutional neural network
Virtex
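Segmentation
Computation
Hardware acceleration
Throughput
Quantization (signal processing)
Artificial intelligence
Computer hardware
Computer engineering
Deep learning
Real-time computing
Embedded system
Computer vision
Algorithm
Compiler
Wireless
Programming language
Telecommunications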
Authors
Duc Khai Lam,Chenying Du,Hoai Luan Pham
Source
Journal: Sensors
[MDPI AG]
Date: 2023-07-25
Volume (issue): 23 (15): 6661-6661
Abstract
Lane detection is one of the most fundamental problems in the rapidly developing field of autonomous vehicles. With the dramatic growth of deep learning in recent years, many models have achieved high accuracy on this task. However, most existing deep-learning methods for lane detection face two main problems. First, most early studies follow a segmentation approach, which requires substantial post-processing to extract the necessary geometric information about the lane lines. Second, many models fail to reach real-time speed because of the high complexity of their architectures. To address these problems, this paper proposes a lightweight convolutional neural network that uses a simple lane representation format for its output, requiring only two small arrays and minimal post-processing instead of segmentation maps. The proposed model achieves 93.53% accuracy on the TuSimple dataset. A hardware accelerator is proposed and implemented on the Virtex-7 VC707 FPGA platform to optimize processing time and power consumption. The accelerator architecture applies several techniques: data quantization to reduce the data width to 8 bits, different loop-unrolling strategies for different convolution layers, and pipelined computation across layers. The implementation processes 640 FPS while consuming only 10.309 W, which corresponds to a computation throughput of 345.6 GOPS and an energy efficiency of 33.52 GOPS/W.
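The performance figures in the abstract are related by simple arithmetic, and the short Python sketch below reproduces that relationship. Only the three input constants come from the abstract. The function names are illustrative, and the per-frame workload and per-frame energy are derived values that assume the quoted throughput and power both apply at the quoted frame rate; the abstract does not state them.

```python
# Figures quoted in the abstract.
FPS = 640.0               # frames per second
POWER_W = 10.309          # power consumption in watts
THROUGHPUT_GOPS = 345.6   # giga-operations per second


def energy_efficiency(throughput_gops: float, power_w: float) -> float:
    """Energy efficiency in GOPS/W: throughput divided by power."""
    return throughput_gops / power_w


def ops_per_frame(throughput_gops: float, fps: float) -> float:
    """Implied workload per frame in GOP (derived, not stated in the abstract)."""
    return throughput_gops / fps


def energy_per_frame_mj(power_w: float, fps: float) -> float:
    """Implied energy per frame in millijoules (derived, not stated in the abstract)."""
    return power_w / fps * 1000.0


if __name__ == "__main__":
    print(f"{energy_efficiency(THROUGHPUT_GOPS, POWER_W):.2f} GOPS/W")
    print(f"{ops_per_frame(THROUGHPUT_GOPS, FPS):.2f} GOP per frame")
    print(f"{energy_per_frame_mj(POWER_W, FPS):.2f} mJ per frame")
```

Dividing 345.6 GOPS by 10.309 W gives the 33.52 GOPS/W reported in the abstract. Under the stated assumption, the same numbers imply roughly 0.54 GOP and about 16 mJ per processed frame.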