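Computer science
Channel (broadcasting)
Feature (linguistics)
Representation (politics)
Net (polyhedron)
Artificial intelligence
CLs upper limits
Pattern recognition (psychology)
Key (lock)
Algorithm
Telecommunications
Mathematics
Politics
Optometry
Medicine
Philosophy
Linguistics
Computer security
Law
Political science
Geometry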
Authors
Mengfan Xue, Jiannan Zheng, Tao Li, Dongliang Peng
Identifier
DOI:10.1142/s0218001423560116
Abstract
Modeling channel and temporal information is crucial for action recognition. To build a high-performance action recognition network that captures both effectively, we propose CLS-Net, an action recognition algorithm based on channel-temporal information modeling. CLS-Net characterizes channel and temporal information by inserting multiple modules into an end-to-end backbone network: a channel attention module (CA module) for modeling channel information, and a long-term temporal module (LT module) and a short-term temporal module (ST module) for modeling temporal information. Specifically, the CA module extracts the correlation between feature channels so that the network can use global information to selectively strengthen informative features and suppress uninformative ones. The LT module shifts some channels along the temporal dimension to enable information exchange across time steps and to model global temporal information. The ST module enhances motion-sensitive features by computing feature-level frame differences, yielding a representation of local motion. Because the way the modules are inserted directly affects the final performance of the model, we propose a novel multi-module insertion scheme, rather than a simple series or parallel connection, so that the modules complement one another and cooperate more efficiently. CLS-Net achieves state-of-the-art performance among networks of the same type on the EgoGesture and Jester datasets and competitive results on the Something-Something V2 dataset.
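The three operations named in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract gives no layer definitions, so everything here is an assumption made for illustration. Features are simplified to a list of T frames with C channel values each (spatial dimensions are assumed to be pooled away), the function names and the `fold_div` parameter are hypothetical, and the channel gate is a parameter-free stand-in for a learned attention module.

```python
import math


def temporal_shift(frames, fold_div=4):
    """LT-style sketch: move a fraction of channels along the time axis.

    frames: list of T frames, each a list of C channel values.
    The first C // fold_div channels take their value from the next frame,
    the following C // fold_div channels from the previous frame, and the
    rest stay in place. Positions with no source frame are zero-filled.
    """
    t, c = len(frames), len(frames[0])
    fold = c // fold_div
    out = [[0.0] * c for _ in range(t)]
    for i in range(t):
        for ch in range(c):
            if ch < fold:
                src = i + 1      # pull from the following frame
            elif ch < 2 * fold:
                src = i - 1      # pull from the preceding frame
            else:
                src = i          # channel is not shifted
            if 0 <= src < t:
                out[i][ch] = frames[src][ch]
    return out


def frame_difference(frames):
    """ST-style sketch: feature-level difference between consecutive frames.

    The last frame has no successor, so its difference is padded with zeros.
    """
    t, c = len(frames), len(frames[0])
    diffs = [
        [frames[i + 1][ch] - frames[i][ch] for ch in range(c)]
        for i in range(t - 1)
    ]
    diffs.append([0.0] * c)
    return diffs


def channel_gate(frames):
    """CA-style stand-in (assumption): gate each channel by its global mean.

    A learned attention module would pass the pooled vector through trainable
    layers; here the sigmoid of the per-channel mean is used directly.
    """
    t, c = len(frames), len(frames[0])
    means = [sum(f[ch] for f in frames) / t for ch in range(c)]
    gates = [1.0 / (1.0 + math.exp(-m)) for m in means]
    return [[f[ch] * gates[ch] for ch in range(c)] for f in frames]
```

For example, with three frames of four channels and `fold_div=4`, `temporal_shift` moves channel 0 one step toward earlier frames and channel 1 one step toward later frames, leaving channels 2 and 3 untouched. How the three modules are combined inside the backbone is the paper's contribution and is not specified in the abstract, so the sketch keeps them as independent functions.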