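Action recognition
Computer science
Benchmark (surveying)
Deep learning
Action (physics)
Artificial intelligence
Convolutional neural network
Machine learning
Code (set theory)
Computation
Physics
Algorithm
Set (abstract data type)
Quantum mechanics
Programming language
Geography
Class (philosophy)
Geodesy
Authors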
Yi Zhu,Xinyu Li,Chunhui Liu,Mehdi Zolfaghari,Yuanjun Xiong,Chongruo Wu,Zhi Zhang,Joseph Tighe,R. Manmatha,Mu Li
Source
Journal: Cornell University - arXiv
Date: 2020-12-11
Citations: 9
Abstract
Video action recognition is one of the representative tasks for video understanding. Over the last decade, we have witnessed great advancements in video action recognition thanks to the emergence of deep learning. But we also encountered new challenges, including modeling long-range temporal information in videos, high computation costs, and incomparable results due to datasets and evaluation protocol variances. In this paper, we provide a comprehensive survey of over 200 existing papers on deep learning for video action recognition. We first introduce the 17 video action recognition datasets that influenced the design of models. Then we present video action recognition models in chronological order: starting with early attempts at adapting deep learning, then to the two-stream networks, followed by the adoption of 3D convolutional kernels, and finally to the recent compute-efficient models. In addition, we benchmark popular methods on several representative datasets and release code for reproducibility. In the end, we discuss open problems and shed light on opportunities for video action recognition to facilitate new research ideas.