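Computer science
Artificial intelligence
Security token
Pattern recognition (psychology)
Embedding
Pose
Dependency (UML)
Transformer
Computer vision
Feature extraction
Convolutional neural network
Feature (linguistics)
Physics
Philosophy
Quantum mechanics
Linguistics
Voltage
Computer security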
Authors
Biao Guo,K Liu,Qian He
Identifier
DOI:10.1007/978-3-031-15919-0_48
Abstract
Although existing methods have made great progress in human pose estimation, many challenging situations are still not handled well, such as occluded limbs, invisible body parts, and complex scenes. In this work, we propose a novel approach called MLPPose, which combines MLP-Mixer layers with a convolutional token embedding for human pose estimation. The MLP-Mixer layers consist of two types of MLP blocks: one provides a global receptive field, and the other mixes the channel features at each location. This composition not only captures the association between different keypoints but also efficiently captures the global dependencies between keypoints and scenes. It thus allows our model to locate keypoints efficiently even when some of them are occluded, invisible, or in complex scenes. It also simplifies the process of extracting global dependencies compared with the attention mechanism widely used in transformer models. Experiments show that our model achieves results competitive with state-of-the-art methods on the MS-COCO and MPII human pose estimation benchmarks. Moreover, our model is more lightweight and faster than other top-performing methods.
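The two MLP block types described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`patch_embed`, `mixer_block`, `init_params`), the layer sizes, the tanh-based GELU, and the use of a non-overlapping patch projection (equivalent to a convolution whose kernel size equals its stride) as the token embedding are all assumptions made for illustration.

```python
import numpy as np

def gelu(x):
    # Tanh approximation of GELU (assumed activation).
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def layer_norm(x, eps=1e-6):
    # Normalize over the last axis; learnable scale/shift omitted for brevity.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def mlp(x, w1, b1, w2, b2):
    # Two-layer perceptron applied along the last axis of x.
    return gelu(x @ w1 + b1) @ w2 + b2

def patch_embed(image, weight, patch):
    # image: (H, W, C_in); weight: (patch * patch * C_in, D).
    # Equivalent to a convolution with kernel size = stride = patch.
    H, W, C = image.shape
    h, w = H // patch, W // patch
    x = image[:h * patch, :w * patch].reshape(h, patch, w, patch, C)
    x = x.transpose(0, 2, 1, 3, 4).reshape(h * w, patch * patch * C)
    return x @ weight  # (tokens, D)

def init_params(rng, tokens, channels, hidden_t=64, hidden_c=128, scale=0.02):
    # Hypothetical helper: random weights for one mixer block.
    def pair(d_in, d_hidden):
        return (rng.normal(size=(d_in, d_hidden)) * scale, np.zeros(d_hidden),
                rng.normal(size=(d_hidden, d_in)) * scale, np.zeros(d_in))
    return {"token": pair(tokens, hidden_t), "channel": pair(channels, hidden_c)}

def mixer_block(x, params):
    # x: (tokens, channels).
    # Token mixing: transpose so the MLP acts across all spatial locations,
    # which gives every token a global receptive field.
    y = layer_norm(x).T                      # (channels, tokens)
    x = x + mlp(y, *params["token"]).T       # residual connection
    # Channel mixing: the MLP acts on the channel vector at each location.
    y = layer_norm(x)
    x = x + mlp(y, *params["channel"])       # residual connection
    return x

rng = np.random.default_rng(0)
image = rng.normal(size=(64, 48, 3))             # toy input, not a real image
embed_w = rng.normal(size=(4 * 4 * 3, 32)) * 0.02
tokens = patch_embed(image, embed_w, patch=4)    # shape (192, 32)
out = mixer_block(tokens, init_params(rng, tokens.shape[0], tokens.shape[1]))
print(out.shape)
```

In this sketch the token-mixing weights have a fixed size tied to the number of tokens, so global dependencies are captured by a single matrix multiplication instead of pairwise attention scores, which is the simplification the abstract alludes to.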