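Computer science
Architecture
Action (physics)
Action recognition
Artificial intelligence
Human-computer interaction
History
Quantum mechanics
Physics
Archaeology
Class (philosophy)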
Authors
Guangle Yao, Xianyuan Liu, Tao Lei
Identifier
DOI:10.1145/3265639.3265672
Abstract
Video action recognition is widely applied in video indexing, intelligent surveillance, multimedia understanding, and other fields. Recently, it has been greatly improved by using convolutional neural networks (ConvNets) to learn deep information. In this paper, we propose a 3D ConvNet-GRU architecture to learn deep information for action recognition. Specifically, we use a 3D ConvNet to learn spatiotemporal information from short RGB clips and optical flow clips, and apply a gated recurrent unit (GRU) to this spatiotemporal information to model the temporal evolution of actions. The experimental results show that our 3D ConvNet-GRU method effectively models the temporal evolution of actions and achieves recognition performance comparable to that of state-of-the-art methods.
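To make the recurrent half of the architecture concrete, here is a minimal sketch of a standard GRU run over a sequence of per-clip feature vectors. It is not the authors' implementation. The feature vectors are assumed to stand in for 3D ConvNet outputs, and every name in it (`init_params`, `gru_step`, `run_gru`, the weight keys) as well as the dimensions and random weights are hypothetical, chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_params(input_dim, hidden_dim, seed=0):
    """Create small random GRU weights (hypothetical names and scale)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    params = {}
    for name in ("z", "r", "h"):  # update gate, reset gate, candidate state
        params["W" + name] = mat(hidden_dim, input_dim)
        params["U" + name] = mat(hidden_dim, hidden_dim)
        params["b" + name] = [0.0] * hidden_dim
    return params


def gate(W, U, b, x, h, act):
    """Compute act(W x + U h + b) element-wise."""
    return [act(a + c + d) for a, c, d in zip(matvec(W, x), matvec(U, h), b)]


def gru_step(x, h, p):
    """One GRU update: h_t = (1 - z) * h_{t-1} + z * candidate."""
    z = gate(p["Wz"], p["Uz"], p["bz"], x, h, sigmoid)
    r = gate(p["Wr"], p["Ur"], p["br"], x, h, sigmoid)
    reset_h = [ri * hi for ri, hi in zip(r, h)]
    cand = gate(p["Wh"], p["Uh"], p["bh"], x, reset_h, math.tanh)
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def run_gru(clip_features, p):
    """Fold a sequence of clip feature vectors into a final hidden state."""
    h = [0.0] * len(p["bz"])
    for x in clip_features:
        h = gru_step(x, h, p)
    return h


if __name__ == "__main__":
    # Two made-up 4-dimensional clip features standing in for 3D ConvNet outputs.
    clips = [[0.1, -0.2, 0.3, 0.0], [0.5, 0.1, -0.4, 0.2]]
    print(run_gru(clips, init_params(input_dim=4, hidden_dim=3)))
```

In a full pipeline, the final hidden state would typically feed a classifier over action labels. That step is omitted here because the abstract does not describe it.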