Ta-Adapter: Enhancing few-shot CLIP with task-aware encoders

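Keywords: Adapter (computing); Computer science; Task (project management); Projectile; Encoder; Artificial intelligence; Computer hardware; Operating system; Engineering; Chemistry; Systems engineering; Organic chemistry
Authors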
Wen-bo ZHANG,Yifan Zhang,Yuyang Deng,Wenlong Zhang,Jianfeng Lin,Binqiang Huang,Jinlu Zhang,Wenhao Yu
Source
Journal: Pattern Recognition [Elsevier]
Volume: 153, Article 110559; Citations: 3
Identifier
DOI:10.1016/j.patcog.2024.110559
Abstract

Contrastive Language-Image Pre-training (CLIP) has shown impressive zero-shot transfer capabilities, but its potential for specific downstream tasks is not fully utilized. To further enhance CLIP's few-shot capability on specific datasets, subsequent works have proposed methods based on lightweight adapters and prompt learning. However, since CLIP is pretrained on a diverse collection of image-text pairs sourced from the internet, it is difficult to sufficiently tune the model to a specific dataset using only lightweight adaptations. In this paper, we argue that substantially modifying the internal representations within CLIP's encoders can yield better results on downstream datasets. We introduce Ta-Adapter, a method that equips both the visual and textual encoders of CLIP with task-specific prompts. These prompts are generated using a collaborative prompt learning approach, which allows the encoders to produce representations that are better aligned with specific downstream datasets. We then initialize an adapter module using the optimized features generated by the task-aware visual encoder for further feature alignment; this module can also be fine-tuned. Extensive experiments on 11 image recognition datasets show that our model obtains average absolute gains of 2.04% and 1.62% over the state-of-the-art few-shot methods Tip-Adapter-F and MaPLe, respectively. In conclusion, this work presents an effective approach to unlocking the potential of CLIP's few-shot learning capabilities.
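The abstract does not spell out the adapter's form, so the following is only a rough sketch of one plausible reading: a key-value cache adapter in the style of Tip-Adapter, whose keys are initialized from few-shot image features (here standing in for the features of the task-aware visual encoder). The function names, the toy 2-D features, and the hyperparameters `alpha` and `beta` are all illustrative assumptions, not details taken from the paper.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def build_cache(support_features, support_labels, num_classes):
    """Initialize the adapter from few-shot data (assumed design):
    keys are normalized image features, values are one-hot labels."""
    keys = [l2_normalize(f) for f in support_features]
    values = [[1.0 if c == y else 0.0 for c in range(num_classes)]
              for y in support_labels]
    return keys, values

def adapter_logits(query, keys, values, text_features, alpha=1.0, beta=5.5):
    """Combine zero-shot logits with cache logits.
    alpha and beta are hypothetical hyperparameters."""
    q = l2_normalize(query)
    num_classes = len(text_features)
    # Zero-shot term: scaled cosine similarity to each class text feature.
    clip_logits = [100.0 * dot(q, l2_normalize(t)) for t in text_features]
    # Cache term: affinity to each stored key, spread over its label.
    cache_logits = [0.0] * num_classes
    for k, v in zip(keys, values):
        affinity = math.exp(-beta * (1.0 - dot(q, k)))
        for c in range(num_classes):
            cache_logits[c] += affinity * v[c]
    return [z + alpha * a for z, a in zip(clip_logits, cache_logits)]

# Toy usage: one support example per class, two classes.
keys, values = build_cache([[2.0, 0.0], [0.0, 3.0]], [0, 1], num_classes=2)
logits = adapter_logits([0.9, 0.1], keys, values, [[1.0, 0.1], [0.1, 1.0]])
predicted = max(range(len(logits)), key=lambda c: logits[c])
print(predicted)
```

In a fine-tuned variant, the keys would be treated as learnable parameters and updated by gradient descent while the values stay fixed; that step is omitted here.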