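Feature (linguistics)
Action recognition
Computer science
Artificial intelligence
Pattern recognition (psychology)
Dual (grammatical number)
Action (physics)
Fusion
Speech recognition
Computer vision
Class (philosophy)
Art
Philosophy
Linguistics
Physics
Literature
Quantum mechanics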
Authors
Di Wu, Jun Wang, Wei Zou, Shengquan Zou, Juxiang Zhou, Jianhou Gan
Identifier
DOI: 10.1016/j.cviu.2024.104068
Abstract
The classroom teaching action recognition task refers to recognizing and understanding teacher actions from the temporal and spatial information in video. Complex backgrounds and significant occlusions make recognizing teacher actions in the classroom environment substantially challenging. In this study, we propose a classroom teacher action recognition approach based on a spatio-temporal dual-branch feature fusion architecture, whose core idea is to use continuous human keypoint heatmap information together with single-frame image information. Specifically, we fuse features from the two modalities, combining image spatial information with temporal human keypoint heatmap information for teacher action recognition. Our approach maintains recognition accuracy while reducing the model's parameter count and computational complexity. Additionally, we constructed a teacher action dataset (CTA) in a real classroom environment, comprising 12 action categories, more than 13k video segments, and a total duration exceeding 15 hours. Experimental results on the CTA dataset validate the effectiveness of the proposed method. Our research explores action recognition in real, complex classroom environments and provides a technical framework for the intelligent analysis of classroom teaching.
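The dual-branch idea in the abstract can be illustrated with a toy sketch. This is not the authors' model: the abstract does not specify the backbones, the fusion operator, or the keypoint count, so the hand-written feature functions, the concatenation-based fusion, the 17 keypoints, the 4-frame clip, the 8 x 8 resolution, and the random linear classifier below are all assumptions made purely for illustration. Only the 12-class output size comes from the abstract.

```python
import random

NUM_CLASSES = 12  # the CTA dataset defines 12 action categories


def spatial_branch(frame):
    """Toy spatial feature from one frame (H x W): mean intensity of each row.

    Stands in for an image backbone, which the abstract does not specify.
    """
    return [sum(row) / len(row) for row in frame]


def temporal_branch(heatmaps):
    """Toy temporal feature from keypoint heatmaps of shape T x K x H x W.

    For each keypoint, take the peak response in every frame and average
    the peaks over time. Stands in for a heatmap backbone (assumption).
    """
    num_frames = len(heatmaps)
    num_keypoints = len(heatmaps[0])
    feats = []
    for k in range(num_keypoints):
        peaks = [max(max(row) for row in heatmaps[t][k]) for t in range(num_frames)]
        feats.append(sum(peaks) / num_frames)
    return feats


def fuse(spatial_feat, temporal_feat):
    """Fuse the two modalities by concatenation (assumed fusion operator)."""
    return spatial_feat + temporal_feat


def classify(fused, weights):
    """Score each class with a linear layer and return the best class index."""
    scores = [sum(w * x for w, x in zip(row, fused)) for row in weights]
    return scores.index(max(scores))


# Hypothetical inputs: one 8 x 8 frame and a 4-frame clip of 17 keypoint heatmaps.
rng = random.Random(0)
frame = [[rng.random() for _ in range(8)] for _ in range(8)]
heatmaps = [
    [[[rng.random() for _ in range(8)] for _ in range(8)] for _ in range(17)]
    for _ in range(4)
]

fused = fuse(spatial_branch(frame), temporal_branch(heatmaps))
# Untrained random weights, so the predicted class is meaningless here.
weights = [[rng.uniform(-1.0, 1.0) for _ in fused] for _ in range(NUM_CLASSES)]
predicted_class = classify(fused, weights)
```

The point of the sketch is the data flow: a single frame supplies appearance cues, the heatmap sequence supplies motion cues, and the classifier sees both through one fused vector. In a real system each branch would be a learned network and the fusion step could be something other than concatenation.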