Semantic Segmentation in Thermal Videos: A New Benchmark and Multi-Granularity Contrastive Learning-Based Framework

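Granularity, Benchmark (surveying), Computer science, Segmentation, Artificial intelligence, Natural language processing, Machine learning, Geography, Cartography, Programming language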
Authors
Yu Zheng,Fugen Zhou,Shangying Liang,Wentao Song,Xiangzhi Bai
Source
Journal: IEEE Transactions on Intelligent Transportation Systems [Institute of Electrical and Electronics Engineers]
Volume (Issue): 24 (12), pp. 14783-14799
Identifier
DOI:10.1109/tits.2023.3300038
Abstract

Video semantic segmentation has achieved great success and is significant for road scene understanding. However, semantic segmentation remains challenging in poor illumination and inclement weather. Thermal cameras, which are highly invariant to lighting and penetrate rain and fog well, enable semantic segmentation to work under these challenging conditions. Thus, this paper explores semantic segmentation in thermal videos to broaden the application scope of road scene understanding. We offer the first thermal video semantic segmentation dataset, TVSS, which includes 1695 thermal videos with 50850 frames of road scenes. It is available at: https://xzbai.buaa.edu.cn/datasets.html. TVSS is finely annotated with 17 categories at a frame rate of 1 fps, with a labeled pixel density of 98.9%. Existing video semantic segmentation methods rely on the amount of labels and the representation power of backbones, and cannot achieve ideal results on thermal videos. Thus, we introduce a multi-granularity contrastive learning based thermal video semantic segmentation model (MGCL), which exploits the abundant unlabeled frames to boost supervised segmentation. Specifically, MGCL constructs multi-granularity self-supervised signals on unlabeled thermal videos by contrastive learning, including the intra-frame context generalization loss, the intra-clip temporal consistency loss, and the inter-video category discrimination loss. In addition, a hard anchor sampling strategy is introduced to focus on hard-to-classify pixels for further performance improvement. Extensive experiments on TVSS demonstrate the superior performance of MGCL in both accuracy and efficiency. Compared with 12 state-of-the-art semantic segmentation methods, MGCL achieves mIoU gains of 2.8% to 8.1% while maintaining inference speed.
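
The abstract names contrastive losses and a hard anchor sampling strategy but does not give their formulas. As a rough illustration only, the sketch below shows a generic InfoNCE-style pixel contrastive loss restricted to "hard" anchors. It is not the authors' MGCL implementation: the function names, the temperature value, the use of class assignments to pick positives and negatives, and the choice of "lowest prediction confidence" as the hardness criterion are all assumptions made for this example.

```python
import math


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one anchor embedding (assumed form, not from the paper).

    All vectors are L2-normalised, so similarities are cosine similarities.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    a = normalize(anchor)
    pos_logit = dot(a, normalize(positive)) / temperature
    neg_logits = [dot(a, normalize(n)) / temperature for n in negatives]
    logits = [pos_logit] + neg_logits
    # Numerically stable log-sum-exp over the positive and all negatives.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - pos_logit


def select_hard_anchors(confidences, k):
    """Indices of the k pixels with the lowest confidence (hypothetical hardness proxy)."""
    return sorted(range(len(confidences)), key=lambda i: confidences[i])[:k]


def hard_anchor_contrastive_loss(embeddings, labels, confidences, k, temperature=0.1):
    """Average InfoNCE loss over the k hardest anchors.

    Positives are other pixels with the same class assignment; negatives are
    pixels with a different one. Anchors lacking either are skipped.
    """
    losses = []
    for i in select_hard_anchors(confidences, k):
        positives = [e for j, e in enumerate(embeddings)
                     if j != i and labels[j] == labels[i]]
        negatives = [e for j, e in enumerate(embeddings)
                     if labels[j] != labels[i]]
        if not positives or not negatives:
            continue
        losses.extend(info_nce(embeddings[i], p, negatives, temperature)
                      for p in positives)
    return sum(losses) / len(losses) if losses else 0.0


# Toy data: four "pixel" embeddings, two classes, per-pixel confidences.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 1, 1]
confidences = [0.95, 0.55, 0.90, 0.60]

hard = select_hard_anchors(confidences, k=2)
loss = hard_anchor_contrastive_loss(embeddings, labels, confidences, k=2)
print("hard anchors:", hard)
print("loss:", loss)
```

In this toy setup only the two least confident pixels act as anchors, so the loss concentrates on the embeddings the classifier is least sure about. A real system would compute the same quantities on batched feature maps in a deep learning framework.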