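Computer science
Segmentation
Artificial intelligence
Computer vision
Pixel
Feature (linguistics)
Class (philosophy)
Object (grammar)
Key (lock)
Computation
Pattern recognition (psychology)
Image segmentation
Algorithm
Computer security
Linguistics
Philosophy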
Authors
Mehwish Awan,Jitae Shin
Identifier
DOI:10.1016/j.cviu.2023.103664
Abstract
Weakly supervised multi-class video segmentation is one of the most challenging yet least studied research problems in computer vision. This study investigates two main questions: (1) how to update features effectively for temporal changes while reusing features between temporal frames; and (2) how to learn object patterns in complex scenes, specifically for videos under weak supervision. Associating image tags with visual appearance is not a straightforward learning task, especially for complex scenes. Therefore, in this paper, we present manifold augmentations to obtain reliable pixel labels from image tags. We propose a framework comprising two key modules: a temporal split module for efficient video processing and a pseudo per-pixel seed generation module for precise pixel-level supervision. In particular, our model exploits temporal correlations via the temporal split module and temporal attention. To reuse the extracted features and incorporate temporal updates for precise and fast computation, we present a channel-wise temporal split mechanism between successive video frames. Furthermore, we evaluate the proposed modules in two additional settings: (1) fully or sparsely supervised road scene video segmentation; and (2) weakly supervised segmentation for complex road scene images. Experiments are conducted on the Cityscapes and CamVid datasets, using DeepLabv3 as the segmentation network and LiteFlowNet to compute motion vectors.
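
The abstract does not specify how the channel-wise temporal split is implemented, so the following is only an illustrative sketch of the general idea: keep part of the previous frame's feature channels and take the remaining channels from the current frame. The function name `temporal_split_update`, the `reuse_ratio` parameter, and the choice to reuse the leading channels are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def temporal_split_update(prev_features, curr_features, reuse_ratio=0.5):
    """Hypothetical channel-wise split between two successive frames.

    Both inputs have shape (C, H, W). The first `reuse_ratio` fraction of
    channels is copied from the previous frame (feature reuse); the rest
    comes from the current frame (temporal update). This is an assumed
    scheme, not the paper's actual module.
    """
    if prev_features.shape != curr_features.shape:
        raise ValueError("feature maps must have the same shape")
    if not 0.0 <= reuse_ratio <= 1.0:
        raise ValueError("reuse_ratio must lie in [0, 1]")

    channels = prev_features.shape[0]
    n_reuse = int(channels * reuse_ratio)  # number of channels carried over

    # Concatenate reused channels with freshly updated channels.
    return np.concatenate(
        [prev_features[:n_reuse], curr_features[n_reuse:]], axis=0
    )


# Usage: 4-channel feature maps for two consecutive frames.
prev = np.zeros((4, 2, 2))
curr = np.ones((4, 2, 2))
fused = temporal_split_update(prev, curr, reuse_ratio=0.5)
```

In a real network the saving would come from computing only the updated channels for the current frame; this sketch takes a full current-frame tensor purely to keep the example self-contained.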