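Computer science
Artificial intelligence
Computer vision
Image processing
Image (mathematics)
Image segmentation
Pattern recognition (psychology)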
Authors
Song Ze, Xudong Kang, Xiaohui Wei, Shutao Li, Haibo Liu
Source
Journal: IEEE Transactions on Image Processing
[Institute of Electrical and Electronics Engineers]
Date: 2024-01-01
Volume/Issue: 1-1
Identifier
DOI: 10.1109/tip.2024.3453008
Abstract
Image geo-localization aims to locate a query image from a source platform (e.g., drones, street vehicles) by matching it with geo-tagged reference images from target platforms (e.g., different satellites). Achieving real-time (>30 FPS) cross-modal or cross-view image localization with guaranteed accuracy in a unified framework remains a challenge because of the large differences in modality and viewpoint between the two platforms. To address this problem, this paper proposes FOENet, a novel image geo-localization method based on fine-grained overlap estimation. Its core idea is to estimate the salient and subtle overlapping regions in image pairs to ensure correct matching. Specifically, a deep convolutional neural network first extracts high-level semantic features from the input images. Then, a novel overlap scanning module (OSM) mines the long-range spatial and channel dependencies of the semantic features in various subspaces, thereby identifying fine-grained overlapping regions. Finally, a triplet ranking loss guides the optimization of the network so that matching regions are pulled as close together as possible and the most mismatched regions are pushed as far apart as possible. To demonstrate the effectiveness of FOENet, comprehensive experiments are conducted on three cross-view benchmarks and one cross-modal benchmark. FOENet yields better performance on various metrics, and the recall accuracy at top 1 (R@1) is significantly improved, with a maximum improvement of 70.6%. In addition, the proposed model runs fast on a single RTX 6000, reaching real-time inference speed on all datasets, with the fastest being 82.3 FPS.
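The abstract names a triplet ranking loss and the R@1 metric without giving formulas. The sketch below is one common way to write both and is not taken from the paper: the Euclidean distance, the margin of 0.3, the hardest-negative selection within a batch, and the function names `triplet_ranking_loss` and `recall_at_1` are all assumptions for illustration. It assumes that query `i` matches reference `i` and that every other reference in the batch is a negative.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_ranking_loss(queries, references, margin=0.3):
    """Mean hinge loss over a batch, using the hardest negative per query.

    Assumption: queries[i] and references[i] form the matching pair;
    the margin value is illustrative, not the paper's setting.
    """
    n = len(queries)
    if n < 2 or n != len(references):
        raise ValueError("need at least two aligned query/reference pairs")
    total = 0.0
    for i in range(n):
        pos = euclidean(queries[i], references[i])
        # Hardest negative: the non-matching reference closest to the query.
        neg = min(euclidean(queries[i], references[j])
                  for j in range(n) if j != i)
        total += max(0.0, margin + pos - neg)
    return total / n


def recall_at_1(queries, references):
    """Fraction of queries whose nearest reference is the true match."""
    hits = 0
    for i, q in enumerate(queries):
        nearest = min(range(len(references)),
                      key=lambda j: euclidean(q, references[j]))
        hits += int(nearest == i)
    return hits / len(queries)


if __name__ == "__main__":
    q = [[0.0, 0.0], [1.0, 1.0]]
    r = [[0.0, 0.1], [1.0, 0.9]]
    print(triplet_ranking_loss(q, r), recall_at_1(q, r))
```

The loss is zero once every matching pair is closer than its hardest negative by at least the margin, which is the behaviour the abstract describes: matching regions are pulled together while the most confusable mismatches are pushed apart.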