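Computer science
Volume (thermodynamics)
Computer vision
Artificial intelligence
Computer graphics (images)
Real-time computing
Quantum mechanics
Physics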
Authors
Yucan Wang, Zhenzhen Wang, Haishan Tian, Yifan Song, Yangjie Cao, Zhiyan Wei
Identifier
DOI:10.1016/j.engappai.2024.107852
Abstract
Although previous learning-based Multi-View Stereo (MVS) methods perform well, they still struggle to reconstruct regions with occlusions or weak textures. In this paper, we propose a Multi-View Stereo network that uses an attention mechanism and a multi-source cost volume, namely AMS-MVSNet. We first introduce an improved multi-level feature pyramid network (FPN) structure that achieves smoother feature transitions across three stages and establishes additional connections between features with larger scale differences. This enhances the fusion of features extracted at different stages. In addition, we construct an attention-enhanced module that assigns different weights according to the rendering effect of the same spatial point in different views. This effectively alleviates the impact of false matches caused by weak textures or occlusions during cost volume construction. Furthermore, we utilize a multi-source cost volume that not only incorporates the matching information computed from each view group, but also introduces the depth map differences obtained from different views. The multi-source cost volume greatly improves the generalization ability of the neural network. Lastly, our network architecture employs a Gated Recurrent Unit (GRU) to reduce memory pressure during depth inference and improve efficiency. Our quantitative and qualitative results on the DTU, Tanks & Temples, and BlendedMVS datasets demonstrate the excellent performance of our network.
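The idea of weighting views during cost volume construction can be illustrated with a minimal sketch. This is not the AMS-MVSNet implementation: the softmax weighting, the per-view reliability scores, and the names `softmax` and `aggregate_costs` are assumptions chosen for illustration. The sketch fuses per-view matching costs for a single pixel so that a view judged unreliable (for example, an occluded one) contributes little to the fused cost.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_costs(per_view_costs, per_view_scores):
    """Fuse per-view matching costs for one pixel into one cost per depth hypothesis.

    per_view_costs[v][d]: matching cost of source view v at depth hypothesis d.
    per_view_scores[v]:   hypothetical reliability score of view v at this pixel
                          (assumed here; a real network would predict it).
    """
    weights = softmax(per_view_scores)
    num_depths = len(per_view_costs[0])
    return [
        sum(w * costs[d] for w, costs in zip(weights, per_view_costs))
        for d in range(num_depths)
    ]


# Toy data: three source views, three depth hypotheses.
costs = [
    [0.9, 0.2, 0.8],  # view 0: clear minimum at depth index 1
    [0.5, 0.5, 0.5],  # view 1: uninformative, e.g. occluded
    [0.8, 0.3, 0.9],  # view 2: clear minimum at depth index 1
]
scores = [2.0, -2.0, 2.0]  # view 1 is scored as unreliable

fused = aggregate_costs(costs, scores)
best_depth = min(range(len(fused)), key=fused.__getitem__)
```

With equal scores the function reduces to a plain average over views, so the weighting only changes the result where the scores distinguish reliable views from unreliable ones.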