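Computer science
Artificial intelligence
Computer vision
Object detection
Motion (physics)
Object (grammar)
Pattern recognition (psychology)
Authors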
Wenjun Hui,Zhenfeng Zhu,Guanghua Gu,Meiqin Liu,Yao Zhao
Identifier
DOI:10.1109/tmm.2024.3361170
Abstract
Video camouflaged object detection aims to identify objects that are visually concealed within their surroundings in a video. Most existing methods focus on analyzing implicit inter-frame motion to capture the camouflaged object. However, because they do not explore the prior explicit motion of the camouflaged object, these methods generally have difficulty capturing the complete object. To address this issue, we propose to integrate implicit and explicit motion learning into a unified framework, the Implicit-Explicit Motion Learning network (IMEX), for video camouflaged object detection. Specifically, to promote the identifiability of the camouflaged object, a cross-scale representation fusion is proposed for global inter-frame alignment. By establishing cross-scale temporal-spatial associations and aggregating the temporal-spatial attentive representations, it also eliminates implicit inter-frame motion to some extent. Moreover, to further improve the discriminability of the boundary regions of the detected object, an explicit motion-induced consistency preservation of camouflaged objects is proposed, in which the prior boundary-aware explicit motion field is leveraged to supervise the consistency of camouflaged objects across consecutive frames. Extensive experiments show that the proposed IMEX achieves substantial performance improvements.