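Satellite
Context (archaeology)
Inference
Geometry
Remote sensing
Computer science
Resolution (logic)
Ground truth
High resolution
Geodesy
Computer vision
Artificial intelligence
Mathematics
Geology
Physics
Astronomy
Paleontology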
Source
Journal: Cornell University - arXiv
Date: 2024-07-10
Identifier
DOI:10.48550/arxiv.2407.08061
Abstract
Predicting realistic ground views from satellite imagery in urban scenes is a challenging task due to the significant view gaps between satellite and ground-view images. We propose a novel pipeline to tackle this challenge by generating geospecific views that maximally respect the weak geometry and texture from multi-view satellite images. Unlike existing approaches that hallucinate images from cues such as partial semantics or geometry from overhead satellite images, our method directly predicts ground-view images at a given geolocation by using a comprehensive set of information from the satellite image, resulting in ground-level images with a resolution boost by a factor of ten or more. We leverage a novel building refinement method to reduce geometric distortions in satellite data at ground level, which ensures the creation of accurate conditions for view synthesis using diffusion networks. Moreover, we propose a novel geospecific prior, which prompts distribution learning of diffusion models to respect image samples that are closer to the geolocation of the predicted images. We demonstrate that our pipeline is the first to generate close-to-real and geospecific ground views based solely on satellite images.