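Artificial intelligence
Computer science
Skeleton (computer programming)
Discriminant
Pattern recognition (psychology)
Normalization (sociology)
Discriminator
Human skeleton
Machine learning
Computer vision
Telecommunications
Sociology
Detector
Anthropology
Programming language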
Authors
Qingzhe Pan, Zhifu Zhao, Xuemei Xie, Jianan Li, Yuhan Cao, Guangming Shi
Source
Journal: IEEE Transactions on Circuits and Systems for Video Technology
[Institute of Electrical and Electronics Engineers]
Date: 2023-12-01
Volume (issue): pages: 33 (12): 7398-7412
Citations: 3
Identifier
DOI:10.1109/tcsvt.2022.3219864
Abstract
Skeleton-based action recognition has attracted great interest in computer vision. For this task, a challenging problem concerns the large intraclass variances of skeleton data, which are mainly caused by diverse viewpoints and subjects, and greatly increase the difficulty of modeling actions through a network. To address the above problem, we propose a variance reduction (VaRe) framework for skeleton-based action recognition, which consists of a view-normalization generative adversarial network (VN-GAN), a subject-independent network (SINet) and a classification network. First, the VN-GAN is responsible for reducing view-induced intraclass variances. Specifically, this network, comprising a generator and a discriminator, is aimed at learning a mapping from a diverse-view skeleton distribution to a unified-view skeleton distribution in an unsupervised manner, thereby generating a view-normalized skeleton. Second, taking the view-normalized skeleton as input, the SINet focuses on reducing the influences of the personal habits of subjects on action recognition. To generate SI skeleton data, the SINet automatically adjusts the human pose according to the human kinematic structure under a classification loss constraint. Finally, without the interference of view- and subject-induced variances, the classification network can concentrate more on learning discriminative action features to predict classes. Furthermore, by combining the joint and bone modalities, the proposed framework achieves competitive performance on three benchmarks: NTU RGB+D, NTU-120 RGB+D and Northwestern-UCLA Multiview Action 3D.
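The abstract mentions combining the joint and bone modalities but does not say how the bone modality is derived. A minimal sketch follows, assuming the convention common in skeleton-based recognition where each bone is the vector from a joint's parent to the joint itself. The function name `bones_from_joints`, the `PARENTS` list, and the four-joint chain are hypothetical illustrations and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Joint = Tuple[float, float, float]  # (x, y, z) coordinates of one joint


def bones_from_joints(joints: Sequence[Joint], parents: Sequence[int]) -> List[Joint]:
    """Return one bone vector per joint: the joint minus its parent joint.

    Assumption: the root joint lists itself as its parent, so its bone
    is the zero vector.
    """
    if len(joints) != len(parents):
        raise ValueError("joints and parents must have the same length")
    bones = []
    for (x, y, z), parent_index in zip(joints, parents):
        px, py, pz = joints[parent_index]
        bones.append((x - px, y - py, z - pz))
    return bones


# Hypothetical 4-joint chain: hip (root) -> spine -> neck -> head.
PARENTS = [0, 0, 1, 2]
frame = [
    (0.0, 0.0, 0.0),   # hip
    (0.0, 0.5, 0.0),   # spine
    (0.0, 1.0, 0.0),   # neck
    (0.0, 1.25, 0.0),  # head
]
bones = bones_from_joints(frame, PARENTS)
```

Under this assumed convention, the bone data has the same shape as the joint data, so the same classification network could in principle be trained on each modality and the two predictions combined. How the proposed framework actually fuses them is not stated in the abstract.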