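Computer science
Sign language
Redundancy (engineering)
Frame (networking)
Artificial intelligence
Encoder
Speech recognition
Feature extraction
Decoding methods
Feature (linguistics)
Pattern recognition (psychology)
Sign (mathematics)
Algorithm
Telecommunications
Mathematical analysis
Philosophy
Linguistics
Mathematics
Operating system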
Identifier
DOI:10.1109/prai59366.2023.10332074
Abstract
Continuous sign language recognition (CSLR) is a many-to-many sequence learning task. The common approach is to extract features and learn sequences from sign language videos with an encoder-decoder network. However, the frames of a continuous sign language video carry relatively little useful feature information, and the large amount of redundant information in each frame leads to insufficient feature extraction. In addition, frames differ in their importance for recognition, which directly affects CSLR accuracy. This paper therefore proposes a CSLR method based on attention mechanisms. First, efficient channel attention (ECA) is incorporated into the residual network in the encoder so that the network extracts more useful information from each frame's features. HardSwish is used as the activation function, which further improves the accuracy of the network. Next, the decoder combines a Long Short-Term Memory (LSTM) network with a temporal attention mechanism that assigns different weights to different frames according to their importance. Finally, we tested our model on the CSL100 dataset and achieved competitive results. The experimental results demonstrate that the attention mechanisms we introduced are effective in improving the accuracy of continuous sign language recognition.
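The three building blocks named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names, the fixed convolution kernel, and the externally supplied frame scores are all assumptions made for illustration. In a trained model the kernel would be learned and the scores would be computed from the LSTM decoder state.

```python
import math


def hardswish(x):
    """HardSwish activation: x * ReLU6(x + 3) / 6."""
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def eca_gate(channel_means, kernel):
    """ECA-style channel gate (illustrative).

    channel_means: one globally average-pooled value per channel.
    kernel: odd-length 1D convolution weights slid across the channel
            axis with zero padding (assumed fixed here, learned in practice).
    Returns one gate in (0, 1) per channel, used to rescale that channel.
    """
    pad = len(kernel) // 2
    n = len(channel_means)
    gates = []
    for c in range(n):
        s = 0.0
        for j, w in enumerate(kernel):
            idx = c + j - pad
            if 0 <= idx < n:  # zero padding outside the channel range
                s += w * channel_means[idx]
        gates.append(sigmoid(s))
    return gates


def temporal_attention(frame_features, scores):
    """Weight frames by a softmax over per-frame scores (illustrative).

    frame_features: list of T feature vectors, one per frame.
    scores: list of T unnormalized importance scores (assumed given).
    Returns (weights, context), where context is the weighted sum of frames.
    """
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(frame_features[0])
    context = [
        sum(w * f[d] for w, f in zip(weights, frame_features))
        for d in range(dim)
    ]
    return weights, context


# Hypothetical usage: three channels, then three frames of 2-D features.
gates = eca_gate([0.2, 1.5, -0.3], kernel=[0.5, 1.0, 0.5])
weights, context = temporal_attention(
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], scores=[0.1, 2.0, 0.3]
)
```

The channel gate rescales each channel of a frame's feature map, while the temporal weights sum to 1 and let informative frames contribute more to the decoder's context vector.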