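Computer science
Artificial intelligence
Motion estimation
Leverage (statistics)
Outlier
Convolutional neural network
Structure from motion
Motion field
Context (archaeology)
Computer vision
Field (mathematics)
Pattern recognition (psychology)
Mathematics
Paleontology
Pure mathematics
Biology
Identifier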
DOI:10.1109/tpami.2023.3334515
Abstract
The multilayer perceptron (MLP) has become the de facto backbone in two-view correspondence learning because it can extract effective deep features from each unordered correspondence individually. However, an MLP natively lacks context information, which limits its performance even though follow-up studies have appended many context-capturing modules. In this paper, we take a different perspective and design ConvMatch, a correspondence learning network that, for the first time, uses a convolutional neural network (CNN), which is inherently capable of context aggregation, as the backbone. Specifically, observing that sparse motion vectors and a dense motion field can be converted into each other by interpolation and sampling, we regularize the putative motion vectors by implicitly estimating the dense motion field, then rectify the errors caused by outliers in local areas with a CNN, and finally obtain correct motion vectors from the rectified motion field. Moreover, we propose global information injection and bilateral convolution to better fit the overall spatial transformation and to accommodate discontinuities of the motion field under large scene disparity. Extensive experiments show that ConvMatch consistently outperforms state-of-the-art methods on relative pose estimation, homography estimation, and visual localization.
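The conversion between sparse motion vectors and a dense motion field mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not ConvMatch's implementation: the abstract says the field is estimated implicitly, whereas the sketch below swaps in a fixed Gaussian kernel-weighted average for interpolation and plain bilinear lookup for sampling. The function names, the grid size, the kernel width `sigma`, and the assumption that keypoints are normalized to the unit square are all hypothetical choices made for illustration.

```python
import numpy as np


def interpolate_to_grid(points, motions, grid_size=16, sigma=0.1):
    """Sparse -> dense: kernel-weighted average of motion vectors on a grid.

    points  : (N, 2) keypoint coordinates (x, y), assumed to lie in [0, 1]^2
    motions : (N, 2) putative motion vectors attached to those keypoints
    Returns a (grid_size, grid_size, 2) field indexed as field[row=y, col=x].
    Assumption: a Gaussian kernel stands in for the learned interpolation.
    """
    ticks = np.linspace(0.0, 1.0, grid_size)
    gx, gy = np.meshgrid(ticks, ticks)            # gx varies along columns
    nodes = np.stack([gx.ravel(), gy.ravel()], axis=1)   # (G*G, 2)

    # Squared distance from every grid node to every keypoint: (G*G, N).
    d2 = ((nodes[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    weights = np.exp(-d2 / (2.0 * sigma ** 2))
    weights /= weights.sum(axis=1, keepdims=True) + 1e-12

    field = weights @ motions                     # (G*G, 2)
    return field.reshape(grid_size, grid_size, 2)


def sample_from_grid(field, points):
    """Dense -> sparse: bilinear lookup of the field at keypoint locations."""
    g = field.shape[0]
    xy = np.clip(points, 0.0, 1.0) * (g - 1)      # continuous grid coordinates
    x0 = np.clip(np.floor(xy[:, 0]).astype(int), 0, g - 2)
    y0 = np.clip(np.floor(xy[:, 1]).astype(int), 0, g - 2)
    fx = (xy[:, 0] - x0)[:, None]                 # fractional offsets in [0, 1]
    fy = (xy[:, 1] - y0)[:, None]

    top = (1 - fx) * field[y0, x0] + fx * field[y0, x0 + 1]
    bottom = (1 - fx) * field[y0 + 1, x0] + fx * field[y0 + 1, x0 + 1]
    return (1 - fy) * top + fy * bottom           # (N, 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.random((200, 2))                    # synthetic keypoints
    mot = np.tile([0.1, -0.2], (200, 1))          # one shared translation
    dense = interpolate_to_grid(pts, mot)
    recovered = sample_from_grid(dense, pts)
    print(dense.shape, recovered.shape)
```

In the pipeline the abstract describes, a CNN would operate on the dense grid between these two steps to rectify outlier-induced errors; that learned stage is omitted here because the abstract does not specify its architecture.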