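Deep learning
Computer science
Artificial intelligence
Mobile device
General-purpose computing on graphics processing units
Object detection
CUDA
Pascal (unit)
Graphics processing unit
Embedded system
Inference
Context (archaeology)
Computer architecture
Graphics
Operating system
Pattern recognition (psychology)
Programming language
Paleontology
Biology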
Authors
Dong-Jin Shin,Jeong-Joon Kim
Source
Journal: Applied Sciences
[Multidisciplinary Digital Publishing Institute]
Date: 2022-04-07
Volume (issue): 12 (8): 3734-3734
Citations: 2
Abstract
Deep learning-based object detection can run inference efficiently on graphics processing units (GPUs). However, general-purpose deep learning frameworks offer limited processing functionality on embedded systems and mobile devices. For this reason, frameworks such as TensorFlow-Lite (TF-Lite) and TensorRT (TRT) are optimized for different hardware. This paper therefore introduces a method for measuring inference performance that combines the Jetson monitoring tool with TensorFlow and TRT source code on the Nvidia Jetson AGX Xavier platform. In addition, the central processing unit (CPU) utilization, GPU utilization, object accuracy, latency, and power consumption of the deep learning frameworks were compared and analyzed. The model is You Only Look Once version 4 (YOLOv4), and the datasets are Common Objects in Context (COCO) and PASCAL Visual Object Classes (VOC). We confirmed that using TensorFlow results in high latency. We also confirmed that TensorFlow-TensorRT (TF-TRT) and TRT, which use Tensor Cores, are the most efficient. However, TF-Lite showed the lowest performance because its GPU support is limited to mobile devices. We expect these measurement results to help services and research proceed efficiently when developing deep learning-based object detection on the Nvidia Jetson platform or in a desktop environment.
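The abstract compares frameworks on latency, among other metrics, but does not show how a measurement is taken. The following is a minimal sketch of one common way to time repeated inference calls, not the authors' tooling. The names `measure_latency` and `fake_detector` are hypothetical, and the detector is a stand-in that only sleeps; a real run would pass a function that invokes the YOLOv4 model in the framework under test.

```python
import statistics
import time
from typing import Callable, Dict, Sequence


def measure_latency(infer: Callable[[object], object],
                    inputs: Sequence[object],
                    warmup: int = 2) -> Dict[str, float]:
    """Time each call to `infer` and summarize latency in milliseconds.

    `infer` is a hypothetical callable wrapping one inference step.
    """
    if not inputs:
        raise ValueError("inputs must not be empty")

    # Warm-up calls are not timed, so one-time setup cost does not skew results.
    for sample in inputs[:warmup]:
        infer(sample)

    timings_ms = []
    for sample in inputs:
        start = time.perf_counter()
        infer(sample)
        timings_ms.append((time.perf_counter() - start) * 1000.0)

    mean_ms = statistics.mean(timings_ms)
    return {
        "runs": float(len(timings_ms)),
        "mean_ms": mean_ms,
        "median_ms": statistics.median(timings_ms),
        "max_ms": max(timings_ms),
        "throughput_fps": 1000.0 / mean_ms if mean_ms > 0 else float("inf"),
    }


def fake_detector(image: object) -> list:
    """Stand-in for a real detector (assumption): sleeps to simulate work."""
    time.sleep(0.001)
    return []


if __name__ == "__main__":
    stats = measure_latency(fake_detector, list(range(20)))
    for name, value in stats.items():
        print(f"{name}: {value:.3f}")
```

Wall-clock timing like this is only meaningful if `infer` returns after the result is actually available; GPU frameworks often launch work asynchronously, so the wrapped function would need to wait for completion (for example by copying the output back to host memory) before returning.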