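Computer science
Normalization (sociology)
Pattern recognition (psychology)
Action recognition
Artificial intelligence
Kernel (algebra)
Graph
Skeleton (computer programming)
Convolutional neural network
Convolution (computer science)
Theoretical computer science
Artificial neural network
Mathematics
Combinatorics
Sociology
Anthropology
Programming language
Class (philosophy)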
Authors
Haoyu Tian, Xin Ma, Xiang Li, Yibin Li
Identifier
DOI:10.1109/tmm.2023.3318325
Abstract
Skeleton-based action recognition has been substantially driven by advances in artificial intelligence and depth sensors. Recently, graph convolutional networks (GCNs) have achieved excellent performance in skeleton-based action recognition. However, the performance of GCN-based methods is impaired by inappropriate node partitioning strategies and obstructed long-range information flow. To address these issues, a novel Select-Assemble-Normalize Graph Convolution Network (SAN-GCN) is proposed to model the spatio-temporal features of the skeleton. First, all skeleton joints are selected as root nodes, and the neighborhood of each root joint is assembled and normalized according to the body structure, which explicitly and interpretably expresses the spatial geometric relations among the skeleton joints. Second, we propose an attention-based assembly and normalization strategy to adaptively capture non-local joints; this adaptive assembly and normalization avoids diluting key long-range features. Moreover, a bi-level aggregation strategy is introduced to learn the spatio-temporal dependencies of joints: the low-level aggregation aligns the normalized neighborhood graphs, and the high-level aggregation aggregates the features of neighbor nodes with a standard convolution kernel. The high-level aggregation can conveniently be realized as either factorized or unified spatio-temporal aggregation. Extensive experiments on four datasets with different numbers of action patterns demonstrate that our model achieves performance comparable to state-of-the-art methods.
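The select, assemble, normalize, and aggregate steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a fixed neighborhood size `k` and orders each neighborhood by hop distance with ties broken by joint index, which stands in for the body-structure ordering the abstract mentions. The attention-based non-local assembly and the temporal dimension are omitted. All function names (`hop_distances`, `normalized_neighborhoods`, `aggregate`) are hypothetical.

```python
from collections import deque

import numpy as np


def hop_distances(num_joints, edges, root):
    """Breadth-first hop distance from `root` to every joint (-1 if unreachable)."""
    adj = [[] for _ in range(num_joints)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    dist = [-1] * num_joints
    dist[root] = 0
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] < 0:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def normalized_neighborhoods(num_joints, edges, k):
    """Select every joint as a root and assemble a fixed-size, ordered neighborhood.

    Assumed ordering: hop distance first, joint index second. Neighborhoods
    with fewer than k reachable joints are padded with the root itself.
    Returns an integer table of shape (num_joints, k).
    """
    table = np.empty((num_joints, k), dtype=np.int64)
    for root in range(num_joints):
        dist = hop_distances(num_joints, edges, root)
        reachable = [j for j in range(num_joints) if dist[j] >= 0]
        order = sorted(reachable, key=lambda j: (dist[j], j))[:k]
        order += [root] * (k - len(order))
        table[root] = order
    return table


def aggregate(features, table, kernel):
    """Two-step aggregation over aligned neighborhoods.

    features: (V, C_in) joint features for one frame.
    table:    (V, K) neighborhood index table.
    kernel:   (K, C_in, C_out) weights shared by all root joints.
    """
    # Low-level step: gather features so slot i means the same thing for every root.
    gathered = features[table]                      # (V, K, C_in)
    # High-level step: one shared kernel applied to every aligned neighborhood.
    return np.einsum("vkc,kco->vo", gathered, kernel)  # (V, C_out)
```

Because every neighborhood has the same size and ordering, a single kernel can be shared across all root joints, in the same way a standard convolution shares weights across image patches. For a five-joint chain, `normalized_neighborhoods(5, [(0, 1), (1, 2), (2, 3), (3, 4)], 3)` gives the table that `aggregate` then consumes.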