Parameter-Efficient Transfer Learning for Remote Sensing Image–Text Retrieval

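Computer science, Transfer of learning, Benchmark (surveying), Artificial intelligence, Image retrieval, Task (project management), Machine learning, Contextual image classification, Deep learning, Pattern recognition (psychology), Image (mathematics), Geodesy, Economics, Management, Geography
Authors
Yuan Yuan, Yang Zhan, Zhitong Xiong
Source
Journal: IEEE Transactions on Geoscience and Remote Sensing [Institute of Electrical and Electronics Engineers]
Volume/issue: 61: 1-14 Cited by: 15
Identifier
DOI: 10.1109/tgrs.2023.3308969
Abstract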

Vision-and-language pre-training (VLP) models have surged in popularity, and fine-tuning them on task-specific datasets has yielded significant performance improvements across various tasks. However, full fine-tuning of VLP models consumes substantial computational resources and carries a considerable environmental cost. Moreover, because remote sensing (RS) data is constantly updated, full fine-tuning may be impractical for real-world applications. To address this issue, we investigate parameter-efficient transfer learning (PETL) as a way to transfer visual-language knowledge effectively and efficiently from the natural domain to the RS domain for the image-text retrieval task. We make the following contributions. 1) We construct a novel PETL framework for RS image-text retrieval (RSITR) that comprises the pretrained CLIP model, a multimodal remote sensing adapter, and a hybrid multimodal contrastive (HMMC) learning objective. 2) To deal with the high intra-modal similarity of RS data, we design a simple yet effective HMMC loss. 3) We provide comprehensive empirical studies of PETL-based RS image-text retrieval; the results indicate that the proposed method is promising for practical applications. 4) We benchmark a wide range of state-of-the-art PETL methods on the RSITR task. The proposed model has only 0.16M trainable parameters, a 98.9% reduction compared with full fine-tuning, which yields substantial savings in training cost. Its retrieval performance exceeds traditional methods by 7-13% and is comparable to or better than full fine-tuning. This work can provide new ideas and useful insights for RS vision-language tasks.
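The abstract names a hybrid multimodal contrastive objective but does not give its formula. The sketch below is therefore not the authors' HMMC loss. It only illustrates the general idea of pairing an inter-modal contrastive term (a symmetric InfoNCE over image-text pairs, as used to train CLIP) with an intra-modal term that discourages embeddings within one modality from collapsing together. The function names, the `weight` parameter, the temperature value of 0.07, and the form of the intra-modal penalty are all assumptions made for illustration.

```python
import math

def l2_normalize(v):
    # Assumes a non-zero vector.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def similarity_matrix(a, b):
    """Cosine similarity between every row of a and every row of b."""
    a = [l2_normalize(v) for v in a]
    b = [l2_normalize(v) for v in b]
    return [[sum(x * y for x, y in zip(u, v)) for v in b] for u in a]

def info_nce(sim, temperature):
    """Mean cross-entropy, treating the diagonal of sim as the positives."""
    total = 0.0
    for i, row in enumerate(sim):
        logits = [s / temperature for s in row]
        m = max(logits)  # subtract the max for numerical stability
        log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_sum - logits[i]
    return total / len(sim)

def inter_modal_loss(img, txt, temperature=0.07):
    """Symmetric image-to-text and text-to-image contrastive loss."""
    sim = similarity_matrix(img, txt)
    sim_t = [list(col) for col in zip(*sim)]
    return 0.5 * (info_nce(sim, temperature) + info_nce(sim_t, temperature))

def intra_modal_penalty(emb):
    """ASSUMED term: mean cosine similarity between distinct samples
    of the same modality. High values mean the samples look alike."""
    sim = similarity_matrix(emb, emb)
    n = len(sim)
    off_diag = [sim[i][j] for i in range(n) for j in range(n) if i != j]
    return sum(off_diag) / len(off_diag)

def hybrid_loss(img, txt, weight=0.5, temperature=0.07):
    """Hypothetical combination: inter-modal loss plus a weighted
    intra-modal penalty averaged over both modalities."""
    intra = 0.5 * (intra_modal_penalty(img) + intra_modal_penalty(txt))
    return inter_modal_loss(img, txt, temperature) + weight * intra

if __name__ == "__main__":
    images = [[1.0, 0.0], [0.0, 1.0]]
    texts = [[1.0, 0.0], [0.0, 1.0]]
    print(hybrid_loss(images, texts))
```

In this toy setup, matched and well-separated embeddings give a loss near zero, while embeddings that are nearly identical within a modality are penalised by both terms. In a PETL setting only the small adapter weights would receive gradients from such a loss while the CLIP backbone stays frozen.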