Authors
Jian Lü, Tingting Huang, Bo Zhao, Xiaogai Chen, Jian Zhou, Kaibing Zhang
Source
Journal: IEEE Sensors Journal
[Institute of Electrical and Electronics Engineers]
Date: 2024-01-23
Volume/Issue: 24 (6): 8184-8196
Identifier
DOI: 10.1109/jsen.2024.3354922
Abstract
Current methods for skeleton-based action recognition face two crucial issues: how to comprehensively capture the evolving features of global context information and temporal dynamics, and how to extract discriminative representations from skeleton joints and body parts. To address these issues, this paper proposes a dual-excitation spatial-temporal graph convolution method. The method adopts a pyramid aggregation structure formed through group convolution, yielding a pyramid channel-split graph convolution module. Its objective is to integrate context information at different scales by splitting channels, facilitating the interaction of information of different dimensions across channels and establishing inter-channel dependencies. A motion excitation module is then introduced, which activates motion-sensitive channels by grouping feature channels and computing feature differences along the temporal dimension, forcing the model to focus on discriminative features associated with motion changes. Additionally, a dual attention mechanism is proposed to highlight key joints and body parts within the overall skeleton action sequence, yielding a more interpretable representation for diverse action sequences. On the NTU RGB+D 60 dataset, accuracy reaches 91.6% on X-Sub and 96.9% on X-View; on the NTU RGB+D 120 dataset, accuracy is 87.5% on X-Sub and 88.5% on X-Set, outperforming other methods and demonstrating the effectiveness of the proposed approach.
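The motion-excitation idea described in the abstract, gating channel groups by their frame-to-frame feature differences, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the function name, group count, and gating formula are illustrative assumptions about the general technique.

```python
import numpy as np

def motion_excitation(x, groups=4):
    """Hedged sketch of a motion-excitation gate (illustrative, not the paper's code).

    x: feature tensor of shape (T, C) -- T frames, C channels.
    Channels are split into `groups`; per-group temporal feature
    differences estimate motion sensitivity, which gates the input.
    """
    T, C = x.shape
    assert C % groups == 0, "channels must divide evenly into groups"
    # frame-to-frame feature differences along the temporal dimension
    diff = np.zeros_like(x)
    diff[:-1] = x[1:] - x[:-1]
    g = diff.reshape(T, groups, C // groups)
    # average motion magnitude per channel group
    score = np.abs(g).mean(axis=(0, 2), keepdims=True)   # shape (1, groups, 1)
    gate = 1.0 / (1.0 + np.exp(-score))                  # sigmoid gating weight
    gated = (x.reshape(T, groups, C // groups) * gate).reshape(T, C)
    return x + gated                                     # residual connection

# usage: 16 frames of 64-channel skeleton features
feats = np.random.randn(16, 64)
out = motion_excitation(feats)
```

Groups whose features change more across frames receive larger gate values, so motion-sensitive channels are emphasized while static ones are passed through mostly unchanged via the residual path.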