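Subnetwork
Computer science
Preprocessor
Robustness (evolution)
Pose
Convolution (computer science)
Artificial intelligence
Feature extraction
Pattern recognition (psychology)
Feature (linguistics)
Data mining
Artificial neural network
Biochemistry
Chemistry
Linguistics
Philosophy
Computer security
Gene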
Authors
Dingning Xu, Rong Zhang, Lijun Guo, Feng Chen, Shangce Gao
Identifier
DOI:10.1016/j.aei.2022.101785
Abstract
Lightweight implementations of existing human pose estimation networks limit the model's representation capability, so they cannot effectively handle variable poses, complex backgrounds, and occlusion in practical applications. To address this problem, this study proposes LDNet, a lightweight human pose estimation network with dynamic convolution. First, we use a lightweight feature extraction head to reduce the number of image preprocessing parameters. Then, we employ a high-resolution parallel subnetwork to predict precise keypoint heatmaps. To reduce the complexity introduced by high-resolution representations while maintaining good network performance, we propose a lightweight dynamic convolution, which copes with changing human poses by adaptively learning different convolution parameters. Finally, to further exploit the relationship between high-level semantic features and spatial structure features when locating different keypoints, we propose a keypoint refinement module, built on the lightweight dynamic convolution, that improves keypoint detection and localization. Overall, LDNet obtains accurate keypoint predictions; compared with many existing networks, such as HRNet, it reduces the number of parameters by 82.1% and the computational complexity by 47.9%. The model achieves an average precision of 73.5% and 88.7% on the COCO 2017 and MPII datasets, respectively. LDNet also shows good prediction accuracy and robustness on the CrowdPose dataset. The proposed network outperforms existing leading lightweight networks and is comparable to existing large-scale human pose estimation networks.
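The abstract does not spell out how the lightweight dynamic convolution is built. As a rough illustration of the general idea behind dynamic convolution (mixing several candidate kernels with input-dependent attention weights, so the effective convolution parameters change with each input), the following pure-Python sketch applies it to a 1x1 convolution. It is not LDNet's actual layer: the function names, the global-average-pool attention branch, and the nested-list tensor layout are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_average_pool(x):
    """Reduce a feature map x[C][H][W] to one mean value per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]


def kernel_attention(x, attn_weights):
    """Input-dependent weights over K candidate kernels.

    attn_weights[K][C] is a hypothetical linear layer that maps the pooled
    channel descriptor to one score per candidate kernel.
    """
    pooled = global_average_pool(x)
    scores = [sum(w * p for w, p in zip(row, pooled)) for row in attn_weights]
    return softmax(scores)


def dynamic_pointwise_conv(x, kernels, attn_weights):
    """Apply a 1x1 convolution whose weights are mixed per input.

    kernels[K][C_out][C_in] holds K candidate 1x1 kernels. The attention
    weights pick a convex combination of them for this particular input.
    """
    pi = kernel_attention(x, attn_weights)
    num_k = len(kernels)
    c_out, c_in = len(kernels[0]), len(kernels[0][0])
    # Aggregate the candidate kernels into a single input-specific kernel.
    mixed = [
        [sum(pi[k] * kernels[k][o][i] for k in range(num_k)) for i in range(c_in)]
        for o in range(c_out)
    ]
    height, width = len(x[0]), len(x[0][0])
    # A 1x1 convolution is a per-pixel linear map across channels.
    return [
        [
            [sum(mixed[o][i] * x[i][r][c] for i in range(c_in)) for c in range(width)]
            for r in range(height)
        ]
        for o in range(c_out)
    ]


if __name__ == "__main__":
    feature_map = [[[1.0, 2.0], [3.0, 4.0]]]   # C=1, H=2, W=2
    candidates = [[[1.0]], [[3.0]]]            # K=2 kernels, C_out=1, C_in=1
    attention_layer = [[0.5], [-0.5]]          # hypothetical attention weights
    print(kernel_attention(feature_map, attention_layer))
    print(dynamic_pointwise_conv(feature_map, candidates, attention_layer))
```

Because the candidate kernels are mixed before the convolution is applied, the per-input cost stays close to that of a single static convolution plus a small attention branch, which is why this family of techniques is attractive for lightweight networks.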