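Computer science
Point cloud
Bottleneck
Ratio
Algorithm
Kernel (algebra)
Point (geometry)
Computation
Sampling (signal processing)
Parallel computing
Artificial intelligence
Mathematics
Embedded system
Computer vision
Discrete mathematics
Geometry
Physics
Filter (signal processing)
Quantum mechanics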
Authors
Meng Han,Liang Wang,Limin Xiao,Hao Zhang,Chenhao Zhang,Xiangrong Xu,Jianfeng Zhu
Identifier
DOI:10.1109/tcad.2023.3274922
Abstract
Point clouds are used extensively in machine perception applications. Farthest point sampling (FPS) is a critical kernel for point cloud processing. As point clouds grow in scale, FPS incurs a large number of memory accesses, which become the bottleneck of large-scale point cloud processing. In this article, we present QuickFPS, an architecture and algorithm co-design of FPS for large-scale point clouds. First, we systematically analyze the characteristics of FPS and propose a bucket-based FPS algorithm. The algorithm introduces a two-level tree data structure to organize the large-scale point cloud into multiple buckets. Two mechanisms applied to the buckets, merged computation and implicit computation, significantly reduce external memory accesses and compute cost. Then, we design an efficient domain-specific accelerator for FPS in large-scale point clouds. The accelerator exploits different forms of parallelism to further improve its efficiency. Finally, we evaluate QuickFPS on several widely used point cloud datasets, which include small-scale and large-scale point clouds (up to 120 000 points). Overall, QuickFPS achieves performance speedups of $43.4\times$ and $12.2\times$ compared to a GTX 1080Ti GPU and the state-of-the-art point cloud accelerator PointAcc, respectively.
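For readers unfamiliar with the kernel, the following is a minimal sketch of the conventional (baseline) farthest point sampling procedure, not the bucket-based QuickFPS algorithm described in the abstract. The function name `farthest_point_sampling`, its parameters, and the choice of starting point are assumptions made for illustration. Each iteration scans every point to update its distance to the selected set, which illustrates why memory accesses grow quickly with point cloud size.

```python
import math


def farthest_point_sampling(points, num_samples, start_index=0):
    """Baseline FPS sketch (hypothetical helper, not the QuickFPS algorithm).

    points: sequence of equal-length coordinate tuples.
    Returns the indices of the sampled points, in selection order.
    """
    n = len(points)
    if not 0 < num_samples <= n:
        raise ValueError("num_samples must be between 1 and len(points)")

    # min_dist[i]: squared distance from point i to its nearest selected point.
    min_dist = [math.inf] * n
    selected = [start_index]
    current = start_index

    for _ in range(num_samples - 1):
        ref = points[current]
        farthest, farthest_dist = -1, -1.0
        # Full pass over the point cloud: one read of every point per sample.
        for i, p in enumerate(points):
            d = sum((a - b) ** 2 for a, b in zip(p, ref))
            if d < min_dist[i]:
                min_dist[i] = d
            if min_dist[i] > farthest_dist:
                farthest, farthest_dist = i, min_dist[i]
        selected.append(farthest)
        current = farthest

    return selected


if __name__ == "__main__":
    cloud = [(0, 0, 0), (1, 0, 0), (10, 0, 0), (5, 0, 0)]
    print(farthest_point_sampling(cloud, 3))
```

With N points and S samples, this sketch performs roughly N × S distance updates, each touching the point data and the distance array. That per-iteration full scan is the kind of cost that grouping points into buckets, as the abstract describes, aims to reduce.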