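Computer science
Discriminant
Semantics (computer science)
Ranking (information retrieval)
Similarity (geometry)
Image (mathematics)
Matching (statistics)
Artificial intelligence
Image retrieval
Semantic similarity
Pattern recognition (psychology)
Information retrieval
Mathematics
Statistics
Programming language
Authors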
Chunxiao Liu,Zhendong Mao,Wenyu Zang,Bin Wang
Identifier
DOI:10.1109/icassp.2019.8683869
Abstract
Image-text matching has attracted considerable interest because it associates different modalities and improves the understanding of images and natural language. It aims to retrieve semantically related images given a text query, and vice versa. Existing approaches have made substantial progress by projecting images and text into a common space where data with different semantics can be distinguished. However, they process all data points uniformly and neglect that data in a neighborhood are harder to distinguish because of their visual similarity or syntactic structural similarity. To address this issue, we propose a neighbor-aware network for image-text matching, in which an intra-attention module and a neighbor-aware ranking loss jointly distinguish data with different semantics and, more importantly, distinguish semantically unrelated data in a neighborhood. The intra-attention module attends to discriminative parts by comparing data with different semantics and magnifying the differences between them, especially the subtle differences between data in a neighborhood. The neighbor-aware ranking loss uses the magnified differences to explicitly and effectively discriminate data in a neighborhood. We conduct extensive experiments on several benchmarks and show that the proposed approach significantly outperforms the state of the art.
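The abstract does not give the formula of the neighbor-aware ranking loss, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a bidirectional hinge ranking loss in which, for each query, only the k most violating negatives (a stand-in for "data in a neighborhood") contribute. The function name `neighbor_ranking_loss`, the parameters `margin` and `k`, and the similarity-matrix layout are all assumptions made for illustration.

```python
def neighbor_ranking_loss(sim, margin=0.2, k=2):
    """Hinge ranking loss over the k hardest negatives per query (illustrative only).

    sim[i][j] is an assumed similarity score between image i and text j;
    matched image-text pairs are assumed to lie on the diagonal.
    The matrix is assumed to be square and non-empty.
    """
    n = len(sim)
    k = min(k, n - 1)  # there are at most n - 1 negatives per query
    total = 0.0
    for i in range(n):
        pos = sim[i][i]
        # Image i as query: every non-matching text j is a negative.
        i2t = sorted(
            (max(0.0, margin + sim[i][j] - pos) for j in range(n) if j != i),
            reverse=True,
        )
        # Text i as query: every non-matching image j is a negative.
        t2i = sorted(
            (max(0.0, margin + sim[j][i] - pos) for j in range(n) if j != i),
            reverse=True,
        )
        # Only the k largest margin violations (the closest negatives) count.
        total += sum(i2t[:k]) + sum(t2i[:k])
    return total / n


if __name__ == "__main__":
    scores = [
        [0.9, 0.8, 0.1],
        [0.2, 0.7, 0.3],
        [0.1, 0.6, 0.8],
    ]
    print(neighbor_ranking_loss(scores, margin=0.2, k=1))
```

Under these assumptions, negatives that already sit far from the query add nothing to the loss, so the gradient signal would come mainly from near neighbors that are semantically unrelated, which is the case the abstract singles out as hard.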