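Computer science
Interpretability
Overfitting
Sign language
Artificial intelligence
Margin (machine learning)
Feature (linguistics)
Feature learning
Inference
Representation (politics)
Benchmark (surveying)
Machine learning
Gesture
Sign (mathematics)
Pattern recognition (psychology)
Natural language processing
Artificial neural network
Philosophy
Mathematical analysis
Law
Geography
Politics
Linguistics
Mathematics
Political science
Geodesy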
Authors
Hezhen Hu, Wengang Zhou, Houqiang Li
Source
Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence
[Association for the Advancement of Artificial Intelligence (AAAI)]
Date: 2021-05-18
Volume (issue), pages: 35 (2): 1558-1566
Citations: 37
Identifier
DOI: 10.1609/aaai.v35i2.16247
Abstract
Hand gestures play a dominant role in the expression of sign language. Current deep-learning-based video sign language recognition (SLR) methods usually follow a data-driven paradigm supervised by the category label. However, those methods suffer from limited interpretability and may overfit because sign data sources are limited. In this paper, we introduce the hand prior and propose a new hand-model-aware framework for isolated SLR that uses the modeled hand as the intermediate representation. We first transform the cropped hand sequence into a latent semantic feature. The hand model then introduces the hand prior and provides a mapping from the semantic feature to a compact hand pose representation. Finally, the inference module enhances the spatio-temporal pose representation and performs the final recognition. Because current sign language datasets lack hand pose annotations, we further guide the learning of the hand pose with multiple weakly-supervised losses that constrain its spatial and temporal consistency. To validate the effectiveness of our method, we perform extensive experiments on four benchmark datasets: NMFs-CSL, SLR500, MSASL and WLASL. Experimental results demonstrate that our method achieves state-of-the-art performance on all four benchmarks by a notable margin.
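The abstract mentions weakly-supervised losses that constrain the temporal consistency of the predicted hand pose but does not give their form. As a rough illustration only, the sketch below penalizes frame-to-frame changes in a pose sequence. The function name `temporal_smoothness_loss`, the flattened per-frame pose vector, and the mean-squared-difference form are all assumptions made for this example, not the losses used in the paper.

```python
from typing import List, Sequence

# Hypothetical layout: one flattened pose vector per video frame.
Pose = List[float]


def temporal_smoothness_loss(poses: Sequence[Pose]) -> float:
    """Mean squared difference between consecutive frames' pose vectors.

    Illustrative stand-in for a temporal-consistency term; the paper's
    actual loss is not specified in the abstract.
    """
    total = 0.0
    count = 0
    # Compare each frame with the one that follows it.
    for prev, curr in zip(poses, poses[1:]):
        if len(prev) != len(curr):
            raise ValueError("all pose vectors must have the same length")
        for a, b in zip(prev, curr):
            total += (b - a) ** 2
            count += 1
    # Fewer than two frames (or empty vectors) means nothing to compare.
    return total / count if count else 0.0


# A static hand incurs no penalty; a jittering hand does.
static_hand = [[0.1, 0.2], [0.1, 0.2], [0.1, 0.2]]
jittery_hand = [[0.0, 0.0], [1.0, 1.0], [0.0, 0.0]]
print(temporal_smoothness_loss(static_hand))
print(temporal_smoothness_loss(jittery_hand))
```

A term of this kind needs no pose annotations, which is why it fits a weakly-supervised setting: it only compares the model's own predictions across neighboring frames.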