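Computer science
Constraint (computer-aided design)
Artificial intelligence
Image (mathematics)
Word (group theory)
Sentence
Matching (statistics)
Reinforcement learning
Natural language processing
Pattern recognition (psychology)
Mathematics
Geometry
Statistics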
Authors
Jie Wu,Chunlei Wu,Jing Lü,Leiquan Wang,Xuerong Cui
Source
Journal: IEEE Transactions on Circuits and Systems for Video Technology
[Institute of Electrical and Electronics Engineers]
Date: 2022-01-01
Volume and issue: 32 (1): 388-397
Citations: 28
Identifier
DOI:10.1109/tcsvt.2021.3060713
Abstract
Image and sentence matching has attracted increasing attention because it bridges two important modalities, vision and language. Previous methods aim to find the latent correspondences between image regions and words by aggregating the similarities of region-word pairs. However, these approaches pay little attention to the relationships among different regions in the image and treat the similarities of all region-word pairs equally. Moreover, focusing too heavily on fine-grained alignment is likely to distort the true meaning of the original image. In this paper, a novel Region Reinforcement Network with Topic Constraint (RRTC) is proposed to explore the correspondences between images and texts. Specifically, the region reinforcement network is built to infer fine-grained correspondence by considering the relationships among regions and re-assigning region-word similarities. Meanwhile, the topic constraint module is presented to summarize the central theme of an image, which constrains deviation from the meaning of the original image. Extensive experimental results on the MSCOCO and Flickr30k datasets verify the effectiveness of the proposed RRTC.
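The contrast the abstract draws, between weighting all region-word similarities equally and re-assigning them, can be illustrated with a toy sketch. This is not the RRTC model: the function names (`uniform_score`, `reweighted_score`), the softmax re-weighting, and the `temperature` parameter are all assumptions chosen for illustration, and the feature vectors are plain Python lists rather than learned embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def softmax(xs, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp((x - m) / temperature) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def best_region_similarities(regions, words):
    """For each word, the similarity of its best-matching image region."""
    return [max(cosine(r, w) for r in regions) for w in words]

def uniform_score(regions, words):
    """Baseline aggregation: every word contributes equally to the image-sentence score."""
    best = best_region_similarities(regions, words)
    return sum(best) / len(best)

def reweighted_score(regions, words, temperature=0.1):
    """Hypothetical re-assignment: words that align well with a region get more weight."""
    best = best_region_similarities(regions, words)
    weights = softmax(best, temperature)
    return sum(wt * b for wt, b in zip(weights, best))

if __name__ == "__main__":
    regions = [[1.0, 0.0], [0.0, 1.0]]   # two toy region features
    words = [[1.0, 0.0], [1.0, 1.0]]     # two toy word features
    print(uniform_score(regions, words))
    print(reweighted_score(regions, words))
```

Because the softmax weights favor the better-aligned words, the re-weighted score is never lower than the uniform average in this sketch; a real model would learn the weighting rather than fix it by hand.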