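Computer science
Parallel computing
CUDA
Massively parallel
Graphics
Graphics processing unit
General-purpose computing on graphics processing units
Implementation
Parallel processing
Object detection
Scheme (mathematics)
Computational science
Computer graphics (images)
Artificial intelligence
Mathematical analysis
Mathematics
Pattern recognition (psychology)
Programming language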
Authors
Manato Hirabayashi, Shinpei Kato, Masato Edahiro, Kazuya Takeda, Seiichi Mita
Source
Journal: IEEE Transactions on Parallel and Distributed Systems
[Institute of Electrical and Electronics Engineers]
Date: 2015-07-08
Volume (issue): 27 (6): 1589-1602
Citations: 7
Identifier
DOI:10.1109/tpds.2015.2453962
Abstract
Object detection is a fundamental challenge facing intelligent applications. Image processing is a promising approach to this end, but its computational cost is often a significant problem. This paper presents schemes for accelerating deformable part models (DPM) on graphics processing units (GPUs). DPM is a well-known algorithm for image-based object detection, and it achieves high detection rates at the expense of computational cost. GPUs are massively parallel compute devices designed to accelerate data-parallel, compute-intensive workloads. According to an analysis of execution times, approximately 98 percent of DPM code exhibits loop processing, which means that DPM could be highly parallelized by GPUs. In this paper, we implement DPM on the GPU by exploiting multiple parallelization schemes. An experimental evaluation of this GPU-accelerated DPM implementation demonstrates that the best-performing scheme, running on an NVIDIA GPU, achieves a speedup of 8.6x over a naive CPU-based implementation.
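
The link between the 98 percent loop share and the achievable speedup can be illustrated with Amdahl's law. The sketch below is not taken from the paper: it assumes that the 98 percent figure is a fraction of total execution time and that the loop portion is fully parallelizable, neither of which the abstract states outright. The function names and the sample acceleration factor of 10 are hypothetical.

```python
def amdahl_speedup(parallel_fraction: float, parallel_speedup: float) -> float:
    """Overall speedup when only part of the runtime is accelerated (Amdahl's law).

    parallel_fraction: share of the original runtime that can be accelerated
                       (assumed here to be the loop share, 0.98).
    parallel_speedup:  factor by which that share runs faster on the GPU
                       (a hypothetical input, not a figure from the paper).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if parallel_speedup <= 0.0:
        raise ValueError("parallel_speedup must be positive")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / parallel_speedup)


def amdahl_limit(parallel_fraction: float) -> float:
    """Upper bound on overall speedup as the accelerated part becomes infinitely fast."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if parallel_fraction == 1.0:
        return float("inf")
    return 1.0 / (1.0 - parallel_fraction)


if __name__ == "__main__":
    loop_share = 0.98  # assumption: loop share of execution time
    print(f"upper bound: {amdahl_limit(loop_share):.1f}x")
    print(f"with loops 10x faster: {amdahl_speedup(loop_share, 10.0):.2f}x")
```

Under these assumptions the remaining 2 percent of serial work caps the overall speedup at about 50x, however fast the loops run on the GPU. Data transfer between host and device, which this model ignores, would lower the bound further.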