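Computer science
Artificial intelligence
Convolutional neural network
Pattern recognition (psychology)
Convolution (computer science)
Feature extraction
Action recognition
Context (archaeology)
Representation (politics)
Feature (linguistics)
Margin (machine learning)
Machine learning
Artificial neural network
Class (philosophy)
Paleontology
Linguistics
Philosophy
Politics
Political science
Law
Biology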
Authors
Zufan Zhang, Zongming Lv, Chenquan Gan, Qingyi Zhu
Identifier
DOI:10.1016/j.neucom.2020.06.032
Abstract
This paper addresses human action recognition using convolutional long short-term memory networks (Conv-LSTM) and fully-connected LSTM (FC-LSTM) with different attention mechanisms. To this end, it designs the spatial-temporal dual-attention network (STDAN), which consists mainly of feature extraction, attention, and fusion modules. Whereas previous work mostly uses only the features of the high-level fully-connected layer, STDAN extracts features from both the convolutional and fully-connected layers of a convolutional neural network (CNN), which enriches the initial video representation. The Conv-LSTM and FC-LSTM then process the long-duration sequential features with different temporal context information. To strengthen spatial-temporal attention, a temporal attention module (TAM) and a joint spatial-temporal attention module (JSTAM) are implemented. Principal component analysis (PCA) and feature fusion are then used to explore and weight the potential of STDAN. Finally, experimental results show that the proposed STDAN achieves better recognition performance than existing state-of-the-art methods.
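The abstract does not say how the temporal attention module scores individual time steps. As a rough illustration of the general idea only, and not the paper's formulation, the sketch below applies a generic soft temporal attention to a sequence of per-frame feature vectors, such as LSTM hidden states: each time step gets a scalar score, the scores are normalized with a softmax, and the output is the weighted sum of the features. The function names, the `tanh` scoring rule, the scoring vector `w`, and the toy dimensions are all assumptions made for this example.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(hidden_states, w):
    """Generic soft attention over time (illustrative, not the paper's TAM).

    hidden_states: list of T feature vectors, each a list of D floats.
    w: hypothetical scoring vector of D floats (would be learned in practice).
    Returns (context, weights): a D-dim weighted sum and the T weights.
    """
    # One scalar score per time step: dot(tanh(h_t), w).
    scores = [
        sum(math.tanh(h_i) * w_i for h_i, w_i in zip(h, w))
        for h in hidden_states
    ]
    # Normalize scores into attention weights that sum to 1.
    weights = softmax(scores)
    # Weighted sum of the feature vectors across time.
    dim = len(hidden_states[0])
    context = [
        sum(a * h[d] for a, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return context, weights


# Toy usage with random features standing in for per-frame LSTM outputs.
random.seed(0)
T, D = 8, 16  # assumed sequence length and feature size
hidden_states = [[random.gauss(0, 1) for _ in range(D)] for _ in range(T)]
w = [random.gauss(0, 1) for _ in range(D)]
context, weights = temporal_attention(hidden_states, w)
```

In this sketch, time steps with larger scores contribute more to `context`. With an all-zero `w` the weights become uniform and the result reduces to a plain average over time.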