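Computer science
Discriminative model
Artificial intelligence
Segmentation
Annotation
Frame (networking)
Pattern recognition (psychology)
Scheme (mathematics)
Machine learning
Mathematical analysis
Telecommunications
Mathematics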
Authors
Yeyan Ning,Fēi Li,Mengping Dong,Zhenbo Li
Identifier
DOI:10.1007/978-3-031-44195-0_39
Abstract
Video Instance Segmentation (VIS) aims to detect, segment, and track the instances that appear in a video. To reduce annotation costs, some existing VIS methods adopt a weakly supervised scheme (WSVIS). However, these WSVIS methods usually run offline, so they cannot handle ongoing or long videos under limited computational resources. It would be of considerable benefit if online models could match or surpass the performance of offline models. In this paper, we propose OWS-Seg, a simple and efficient end-to-end online WSVIS network trained with box annotations. Concretely, OWS-Seg consists of two novel contrastive learning branches: the Instance Contrastive Learning (ICL) branch learns instance-level discriminative features to distinguish different instances in each frame, and the Mask Contrastive Learning (MCL) branch with Boxccam learns pixel-level discriminative features to differentiate foreground from background. Experimental results show that OWS-Seg achieves promising performance: 43.5% AP on YouTube-VIS 2019, 36.6% AP on YouTube-VIS 2021, and 21.9% AP on OVIS. In addition, OWS-Seg achieves performance comparable to offline WSVIS methods and surpasses recent fully supervised methods, which suggests that it is suitable for a wide range of practical applications.
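The abstract does not state which loss the ICL and MCL branches use. As a rough illustration of what "contrastive learning of discriminative features" typically means, the following is a minimal sketch of a generic InfoNCE-style loss in plain Python. It is not the OWS-Seg implementation: the function name `info_nce_loss`, the `temperature` value, and the convention that row i of `anchors` is paired with row i of `positives` are all assumptions made for this example, and every embedding is assumed to be a non-zero vector.

```python
import math


def _normalize(v):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce_loss(anchors, positives, temperature=0.1):
    """Generic InfoNCE loss (illustrative only, not the paper's loss).

    anchors[i] and positives[i] are embeddings of the same instance;
    every positives[j] with j != i acts as a negative for anchors[i].
    """
    a = [_normalize(v) for v in anchors]
    p = [_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Temperature-scaled cosine similarity of anchor i to every candidate.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of picking the true match.
        total += log_denom - logits[i]
    return total / len(a)


# Two instances with well-separated embeddings.
matched = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
# The same embeddings paired with the wrong instance.
swapped = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this sketch the loss is low when each anchor is most similar to its own positive (`matched`) and high when it is more similar to another instance (`swapped`), which is the sense in which such a loss encourages features that separate instances, or foreground from background pixels.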