Gesture
Computer science
Gesture recognition
Feature (linguistics)
Artificial intelligence
Pyramid (geometry)
Backbone network
Pattern recognition (psychology)
Task (project management)
Translation (biology)
Bridge (graph theory)
Exploit
Computer vision
Engineering
Medicine
Computer network
Philosophy
Linguistics
Physics
Biochemistry
Chemistry
Systems engineering
Computer security
Messenger RNA
Internal medicine
Optics
Gene
Authors
Hao Liang,Lunke Fei,Shuping Zhao,Jie Wen,Shaohua Teng,Yong Xu
Identifier
DOI:10.1016/j.patcog.2023.109901
Abstract
Hand gesture recognition from images is a longstanding computer vision task that can serve as a bridge to human-computer interaction and sign language translation. A number of methods have been proposed for hand gesture recognition (HGR); however, difficult scenarios such as hand gestures at different scales and complex backgrounds make them less effective. In this paper, we propose an end-to-end multiscale feature learning network for HGR, which consists of a CNN-based backbone network, a feature aggregation pyramid network (FAPN) embedded with a two-stage expansion-squeeze-aggregation (ESA) module, and three task-specific prediction branches. First, the backbone network extracts multiscale features from the original hand gesture images. Next, the FAPN embedded with the two-stage ESA module extensively exploits multiscale feature information and learns gesture-specific features at different scales. Then, a mask loss guides the network to locate hand-specific regions during the training stage, and finally, the classification and regression branches output the category and location of a hand gesture during both training and prediction. Experimental results on two publicly available datasets show that the proposed method outperforms most state-of-the-art HGR methods.
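To make the described pipeline concrete, the following is a minimal PyTorch sketch of the overall structure the abstract outlines: a CNN backbone that yields multiscale features, a pyramid-style aggregation stage with an expansion-squeeze-aggregation step, and three task heads (mask, classification, box regression). Only this high-level layout follows the abstract; the channel widths, the ESABlock internals, the TinyBackbone stand-in, and the fusion scheme are illustrative assumptions, not the authors' FAPN implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F


class ESABlock(nn.Module):
    """Hypothetical expansion-squeeze-aggregation block: expand channels,
    squeeze them back, then aggregate with the input via a residual sum."""
    def __init__(self, channels, expansion=4):
        super().__init__()
        self.expand = nn.Conv2d(channels, channels * expansion, kernel_size=1)
        self.squeeze = nn.Conv2d(channels * expansion, channels, kernel_size=1)

    def forward(self, x):
        y = F.relu(self.expand(x))
        y = self.squeeze(y)
        return F.relu(x + y)  # aggregation modeled as a residual connection (assumed)


class TinyBackbone(nn.Module):
    """Stand-in CNN backbone producing features at three scales (1/4, 1/8, 1/16)."""
    def __init__(self):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(3, 64, 3, stride=4, padding=1), nn.ReLU())
        self.stage2 = nn.Sequential(nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())
        self.stage3 = nn.Sequential(nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU())

    def forward(self, x):
        c1 = self.stage1(x)
        c2 = self.stage2(c1)
        c3 = self.stage3(c2)
        return [c1, c2, c3]


class HGRNet(nn.Module):
    """Backbone -> multiscale fusion with ESA -> mask / classification / box heads."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.backbone = TinyBackbone()
        # Project all scales to a common channel width before fusing (assumed).
        self.lateral = nn.ModuleList(
            [nn.Conv2d(c, 128, kernel_size=1) for c in (64, 128, 256)]
        )
        self.esa = ESABlock(128)
        self.mask_head = nn.Conv2d(128, 1, kernel_size=1)  # hand-region mask logits
        self.cls_head = nn.Linear(128, num_classes)        # gesture category
        self.box_head = nn.Linear(128, 4)                  # (cx, cy, w, h) regression

    def forward(self, x):
        feats = self.backbone(x)
        # Upsample coarser levels to the finest resolution and sum them (simple fusion).
        finest = self.lateral[0](feats[0])
        fused = finest
        for lat, f in zip(self.lateral[1:], feats[1:]):
            fused = fused + F.interpolate(lat(f), size=finest.shape[-2:], mode="nearest")
        fused = self.esa(fused)
        mask = self.mask_head(fused)
        pooled = F.adaptive_avg_pool2d(fused, 1).flatten(1)
        return mask, self.cls_head(pooled), self.box_head(pooled)


if __name__ == "__main__":
    model = HGRNet(num_classes=10)
    mask, cls_logits, boxes = model(torch.randn(2, 3, 224, 224))
    print(mask.shape, cls_logits.shape, boxes.shape)

In training, the mask output would be supervised with a hand-region mask loss while the classification and regression heads are trained jointly, mirroring the multi-task setup described in the abstract; the specific loss weighting is not given in the abstract and is left out here.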