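Computer science
Artificial intelligence
Convolutional neural network
Convolution (computer science)
Joint (building)
Feature (linguistics)
Action recognition
Pattern recognition (psychology)
Process (computing)
Layer (electronics)
Feature extraction
Action (physics)
Recurrent neural network
Artificial neural network
Deep learning
Machine learning
Architectural engineering
Linguistics
Philosophy
Chemistry
Physics
Organic chemistry
Quantum mechanics
Engineering
Class (philosophy)
Operating system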
Authors
Yangyang Xu, Lei Wang, Jun Cheng, Haiying Xia, Jianqin Yin
Identifier
DOI:10.1109/compcomm.2017.8322825
Abstract
In this paper, we propose a new architecture for human action recognition that uses a convolutional neural network (CNN) and two Long Short-Term Memory (LSTM) networks with a temporal-wise attention model. We call this network the Double LSTM with Temporal-wise Attention network (DTA). The features extracted by our model are both spatial and temporal. The attention model can learn which parts of which frames in a video are relevant to the video label and pay more attention to them. We designed a joint optimization layer (JOL) to jointly process the two kinds of features produced by the two LSTMs. The proposed network achieved improved performance on three widely used datasets: the UCF Sports dataset, the UCF11 dataset and the HMDB51 dataset.
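The abstract does not give the attention equations, so the following is only a minimal sketch of generic soft attention over time, not the authors' DTA model. It assumes each frame already has a feature vector and a scalar relevance score; in a trained network the scores would be learned, whereas here they are passed in by hand. The function name `temporal_attention` and its parameters are hypothetical.

```python
import math


def temporal_attention(frame_features, scores):
    """Pool per-frame feature vectors into one vector using softmax weights.

    Assumption: `scores` holds one relevance score per frame. A real model
    would compute these scores from the features; here they are given.
    """
    if not scores or len(frame_features) != len(scores):
        raise ValueError("need at least one frame and one score per frame")

    # Numerically stable softmax over the time axis.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of frame features: higher-scored frames contribute more.
    dim = len(frame_features[0])
    pooled = [
        sum(w * f[d] for w, f in zip(weights, frame_features))
        for d in range(dim)
    ]
    return weights, pooled


# Three frames with 2-D features; the second frame gets the highest score.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, pooled = temporal_attention(features, scores=[0.0, 2.0, 0.0])
```

The weights sum to one, so the pooled vector stays on the same scale as the per-frame features while emphasizing the frames judged most relevant.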