Cross-domain Detection Transformer based on Spatial-aware and Semantic-aware Token Alignment

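Computer science; Security token; Transformer; Artificial intelligence; Object detection; Encoder; Discriminator; Data mining; Machine learning; Pattern recognition (psychology); Computer network; Detector; Operating system; Physics; Telecommunications; Voltage; Quantum mechanics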
Authors
Jinhong Deng, Xiaoyue Zhang, Wen Li, Lixin Duan, Dong Xu
Source
Journal: IEEE Transactions on Multimedia [Institute of Electrical and Electronics Engineers]
Volume/Issue: : 1-12 Citations: 1
Identifier
DOI:10.1109/tmm.2023.3330524
Abstract

Detection transformers such as DETR [1] have recently exhibited promising performance for many object detection tasks, but the generalization ability of those methods is still quite limited for cross-domain adaptation scenarios. To address the cross-domain issue, a straightforward method is to perform token alignment with adversarial training in transformers. However, its performance is often unsatisfactory because the tokens in detection transformers are quite diverse and represent different spatial and semantic information. In this paper, we propose a new method for cross-domain detection transformers called spatial-aware and semantic-aware token alignment (SSTA). Specifically, we take advantage of the characteristics of cross-attention as used in the detection transformer and propose spatial-aware token alignment (SpaTA) and semantic-aware token alignment (SemTA) strategies to guide the token alignment across domains. For spatial-aware token alignment, we extract the information from the cross-attention map (CAM) to align the distribution of tokens according to their attention to object queries. For semantic-aware token alignment, we inject the category information into the cross-attention map and construct domain embedding to guide the learning of a multi-class discriminator to model the category relationship and achieve category-level token alignment during the entire adaptation process. We conduct extensive experiments on several widely-used benchmarks, and the results clearly show the effectiveness of our proposed approach over existing state-of-the-art methods.
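
The abstract does not give the loss formulation, so the following is only a minimal sketch of one way attention-guided token alignment could look, not the authors' SpaTA implementation. It assumes a cross-attention map with one row per object query and one column per encoder token, derives a per-token weight from the attention each token receives, and uses those weights in a binary domain-classification loss. In adversarial training the discriminator would minimize this loss while the feature extractor is trained to maximize it, typically through a gradient reversal layer. The names `token_weights` and `weighted_domain_loss`, and the choice of mean attention as the weight, are hypothetical.

```python
import math


def token_weights(cross_attention):
    """Turn a cross-attention map into one weight per encoder token.

    cross_attention[q][t] is the attention that object query q pays to
    token t. The weight of a token is the mean attention it receives
    across queries, rescaled so that all weights sum to 1.
    (Assumption: the paper may derive its weights differently.)
    """
    num_queries = len(cross_attention)
    num_tokens = len(cross_attention[0])
    received = [
        sum(row[t] for row in cross_attention) / num_queries
        for t in range(num_tokens)
    ]
    total = sum(received)
    return [r / total for r in received]


def weighted_domain_loss(domain_logits, domain_label, weights):
    """Attention-weighted binary cross-entropy over per-token domain logits.

    domain_label is 0 for a source-domain image and 1 for a target-domain
    image. Tokens that receive more attention contribute more to the loss.
    """
    eps = 1e-7  # keeps log() away from zero
    loss = 0.0
    for logit, w in zip(domain_logits, weights):
        p = 1.0 / (1.0 + math.exp(-logit))  # sigmoid
        p = min(max(p, eps), 1.0 - eps)
        loss -= w * (
            domain_label * math.log(p)
            + (1 - domain_label) * math.log(1.0 - p)
        )
    return loss


# Toy example: 2 object queries attending over 3 encoder tokens.
cam = [[0.7, 0.2, 0.1],
       [0.5, 0.4, 0.1]]
weights = token_weights(cam)
loss = weighted_domain_loss([0.0, 0.0, 0.0], domain_label=1, weights=weights)
```

With all logits at zero the discriminator is maximally uncertain, so every token contributes the same per-unit loss and the weights only decide how that loss is distributed across tokens.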