Computer Science
Modality
Artificial Intelligence
Natural Language Processing
Information Retrieval
Pattern Recognition
Authors
Zongyi Li, Jianbo Li, Yuxuan Shi, Hefei Ling, Jiazhong Chen, Runsheng Wang, Shijuan Huang
Identifier
DOI:10.24963/ijcai.2024/116
Abstract
Text-based Person Search aims to retrieve a specified person using a given text query. Current methods predominantly rely on paired, labeled image-text data to train the cross-modality retrieval model, necessitating laborious and time-consuming annotation. In response to this challenge, we present the Cross-modal Generation and Alignment via Attribute-guided Prompt (GAAP) framework for fully unsupervised text-based person search, which uses only unlabeled images. The proposed GAAP framework consists of two key parts: an Attribute-guided Prompt Caption Generation module and an Attribute-guided Cross-modal Alignment module. The Attribute-guided Prompt Caption Generation module produces pseudo text labels by feeding attribute prompts into a large-scale pre-trained vision-language model. These synthetic texts are then filtered by a sample-selection step, ensuring their reliability for subsequent fine-tuning. The Attribute-guided Cross-modal Alignment module comprises three sub-modules for aligning features across modalities. First, Cross-Modal Center Alignment (CMCA) aligns samples with the centroids of the different modalities. Next, to address the ambiguity that arises when local attributes are similar, an Attribute-guided Image-Text Contrastive Learning (AITC) module aligns the relationships among different pairs by taking local attribute similarities into account. Last, an Attribute-guided Image-Text Matching (AITM) module mitigates the noise in pseudo captions by using the image-attribute matching score to soften the hard matching labels. Empirical results demonstrate the effectiveness of our method on various text-based person search datasets under the fully unsupervised setting.
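The caption-generation side of the framework rests on scoring attribute prompts against unlabeled images with a pre-trained vision-language model. The abstract gives no code or attribute vocabulary, so the following is only a minimal sketch of that prompt-scoring step, assuming a CLIP backbone from Hugging Face transformers and a hypothetical attribute prompt list; it is not the authors' pipeline.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Hypothetical attribute vocabulary; the paper's actual attribute set and
# prompt templates are not specified in the abstract.
ATTRIBUTE_PROMPTS = [
    "a photo of a person wearing a red jacket",
    "a photo of a person wearing a white shirt",
    "a photo of a person carrying a black backpack",
    "a photo of a person with long hair",
]

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def rank_attributes(image: Image.Image, top_k: int = 2):
    """Score every attribute prompt against one unlabeled image and keep the
    most confident matches, which could then seed a pseudo caption."""
    inputs = processor(text=ATTRIBUTE_PROMPTS, images=image,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        logits = model(**inputs).logits_per_image  # shape (1, num_prompts)
    probs = logits.softmax(dim=-1)[0]
    scores, idx = probs.topk(top_k)
    return [(ATTRIBUTE_PROMPTS[i], s.item()) for i, s in zip(idx.tolist(), scores)]
```

In the full framework these attribute scores would feed the caption generator and the sample-selection filter described above; here they only illustrate how attribute prompts can be scored by an off-the-shelf vision-language model.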
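On the alignment side, both AITC and AITM amount to replacing the one-hot targets of a standard image-text contrastive or matching loss with targets softened by attribute similarity. Below is a minimal PyTorch sketch of that label-softening idea, assuming precomputed embeddings and a symmetric pairwise attribute-similarity matrix `attr_sim` (all names and the blending scheme are assumptions, not the authors' exact loss).

```python
import torch
import torch.nn.functional as F

def soft_target_contrastive_loss(img_emb, txt_emb, attr_sim, tau=0.07, alpha=0.2):
    """InfoNCE-style loss whose one-hot targets are softened by attribute similarity.

    img_emb, txt_emb: (B, D) image / pseudo-caption embeddings.
    attr_sim: (B, B) pairwise attribute-similarity matrix (assumed symmetric here).
    alpha: probability mass moved from the hard diagonal targets onto
           attribute-similar off-diagonal pairs.
    """
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / tau                     # (B, B) similarity logits

    hard = torch.eye(logits.size(0), device=logits.device)   # exact pseudo pairs
    soft = F.softmax(attr_sim / tau, dim=-1)                 # attribute-level relations
    targets = (1.0 - alpha) * hard + alpha * soft            # softened matching labels

    loss_i2t = -(targets * F.log_softmax(logits, dim=-1)).sum(dim=-1).mean()
    loss_t2i = -(targets * F.log_softmax(logits.t(), dim=-1)).sum(dim=-1).mean()
    return 0.5 * (loss_i2t + loss_t2i)
```

With alpha set to 0 this reduces to a plain symmetric contrastive loss over the pseudo pairs; a nonzero alpha lets images that share local attributes tolerate partial matches, which is the intuition the abstract attributes to AITC and AITM.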