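Computer science
Image retrieval
Task (project management)
Semantics (computer science)
Artificial intelligence
Benchmark (surveying)
Image (mathematics)
Information retrieval
Modality (human-computer interaction)
Deletion
Selection (genetic algorithm)
Computer vision
Natural language processing
Pattern recognition (psychology)
Economics
Geography
Programming language
Management
Geodesy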
Authors
Gangjian Zhang, Shikui Wei, Huaxin Pang, Shuang Qiu, Yao Zhao
Source
Journal: IEEE Transactions on Image Processing
[Institute of Electrical and Electronics Engineers]
Date: 2022-01-01
Volume: 31, pp. 5976-5988
Citations: 6
Identifiers
DOI: 10.1109/tip.2022.3204213
Abstract
Composed image retrieval aims to retrieve the desired images given a reference image and a piece of text. Handling this task requires modeling two subprocesses: erasing the details of the reference image that are irrelevant to the text, and replenishing the image with the details the text asks for. Existing methods do not distinguish between the two subprocesses and instead handle them implicitly and jointly. To model the two subprocesses explicitly and in order, we propose a novel composed image retrieval method with three key components: the Multi-semantic Dynamic Suppression module (MDS), the Text-semantic Complementary Selection module (TCS), and the Semantic Space Alignment constraints (SSA). Concretely, MDS erases irrelevant details of the reference image by suppressing its semantic features. TCS selects and enhances the semantic features of the text and then replenishes the reference image with them. Finally, to facilitate the erasure and replenishment subprocesses, SSA aligns the semantics of the two modalities' features in the final space. Extensive experiments on three benchmark datasets (Shoes, FashionIQ, and Fashion200K) show that our approach outperforms state-of-the-art methods.
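The erase-then-replenish ordering described in the abstract can be illustrated with a toy sketch. This is not the authors' MDS or TCS implementation: the hand-made 4-dimensional "semantic" vectors, the sigmoid gates, the `sharpness` parameter, and the function names `suppress`, `replenish`, and `retrieve` are all assumptions made for illustration, and the training-time SSA constraint is not modeled at all.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def suppress(ref, text, sharpness=8.0):
    # Erasure step (hypothetical gate): channels the text talks about,
    # whether to remove or to add, are scaled down in the reference features.
    keep = [1.0 - sigmoid(sharpness * (abs(t) - 0.5)) for t in text]
    return [r * k for r, k in zip(ref, keep)]

def replenish(suppressed, text, sharpness=8.0):
    # Replenishment step (hypothetical gate): only positively requested
    # text channels are selected and added back into the features.
    add = [sigmoid(sharpness * (t - 0.5)) * max(t, 0.0) for t in text]
    return [s + a for s, a in zip(suppressed, add)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(ref, text, gallery):
    # Compose the query in two explicit, ordered steps, then rank the
    # gallery by cosine similarity to the composed query.
    query = replenish(suppress(ref, text), text)
    return sorted(gallery, key=lambda name: cosine(query, gallery[name]),
                  reverse=True)

# Toy feature axes (assumed): [red, blue, high heel, flat]
gallery = {
    "red heel":  [1.0, 0.0, 1.0, 0.0],
    "blue heel": [0.0, 1.0, 1.0, 0.0],
    "blue flat": [0.0, 1.0, 0.0, 1.0],
}
reference = [1.0, 0.0, 1.0, 0.0]       # a red high-heeled shoe
modification = [-1.0, 1.0, 0.0, 0.0]   # "replace red with blue"
ranked = retrieve(reference, modification, gallery)
print(ranked)
```

In this toy setup the "red" channel is suppressed before the "blue" channel is added, while the untouched "high heel" channel passes through, so the composed query sits closest to the blue high-heeled item. A learned model would replace the fixed gates with trained modules operating on real image and text embeddings.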