Jie Zhu, Yuan Zong, Hongli Chang, Yushun Xiao, Li Zhao
Source
Journal: IEEE Signal Processing Letters [Institute of Electrical and Electronics Engineers]  Date: 2022-01-01  Volume: 29, pp. 2073-2077  Citations: 12
Identifier
DOI: 10.1109/lsp.2022.3211200
Abstract
Despite considerable work on extracting emotion descriptors from hidden information, learning an effective spatiotemporal feature remains a challenging issue for micro-expression recognition, because micro-expressions exhibit only subtle dynamic changes and occur in localized facial regions. These properties suggest that the representation is sparse in the spatiotemporal domain. In this letter, a high-performance spatiotemporal feature learning method based on a sparse transformer is presented to address this issue. We extract strongly associated spatiotemporal features by distinguishing the spatial attention map and attentively fusing the temporal features. Thus, the feature map extracted from critical relations is fully utilized, while superfluous relations are masked. Our proposed method achieves remarkable results compared to state-of-the-art methods, showing that sparse representation can be successfully integrated into the self-attention mechanism for micro-expression recognition.
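The core idea of masking superfluous relations in the attention map can be illustrated with a short sketch: only the strongest relations per query token are kept before the softmax, yielding a sparse attention map. This is a minimal sketch assuming a simple top-k sparsification rule; the function `sparse_self_attention` and the `top_k` parameter are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def sparse_self_attention(q, k, v, top_k=8):
    """Self-attention where only the top-k strongest relations per query
    token are kept; the remaining (superfluous) entries are masked out
    before the softmax. Hypothetical sketch, not the paper's code."""
    # q, k, v: (batch, num_tokens, dim)
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5        # (batch, N, N) relation scores
    # Find the k-th largest score per query row and mask everything below it
    kth = scores.topk(top_k, dim=-1).values[..., -1:]  # (batch, N, 1)
    scores = scores.masked_fill(scores < kth, float('-inf'))
    attn = F.softmax(scores, dim=-1)                   # sparse attention weights
    return attn @ v                                    # attentively fused features

# Example: 16 spatiotemporal tokens of dimension 64
x = torch.randn(2, 16, 64)
out = sparse_self_attention(x, x, x, top_k=4)
print(out.shape)  # torch.Size([2, 16, 64])
```

Under this assumed scheme, the kept entries correspond to the "critical relations" the abstract refers to, while the masked entries receive zero weight after the softmax.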