STFDiff: Remote sensing image spatiotemporal fusion with diffusion models

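Computer science, Image fusion, Diffusion, Fusion, Remote sensing, Image (mathematics), Computer vision, Artificial intelligence, Geology, Physics, Linguistics, Thermodynamics, Philosophy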

Authors

He Huang, Wei He, Hongyan Zhang, Yu Xia, Liangpei Zhang

Source

Journal: Information Fusion [Elsevier BV]
Date: 2024-06-07 Volume/issue: 111: 102505-102505 Citations: 6

Identifiers

DOI: 10.1016/j.inffus.2024.102505

Abstract

Spatiotemporal fusion (STF) methods aim to blend satellite images with different spatial and temporal resolutions to support more frequent and precise monitoring. Over the past decades, many STF methods have been developed with remarkable success. However, traditional methods rely on a linear assumption and fail in complex, diverse scenes with strong dynamics. Deep learning-based methods suffer from the spatial, temporal, and spectral uncertainties in STF and from the mode collapse problem of generative adversarial networks (GANs) on remote sensing images with complex scenes. To address these problems, we propose a novel spatiotemporal fusion method with diffusion models (STFDiff) that merges a coarse image at the prediction date with the coarse-fine image pairs acquired at other dates to generate the fine image at the prediction date. STFDiff generates the fine image by iteratively refining an initial Gaussian noise sample under the control of the prior images acquired at other dates. At each iteration, the noise is predicted by a conditional noise predictor, the dual-stream Unet (DS-Unet), which enhances the noise features by subtracting the features extracted by the dual-stream encoders (DS-encoders). The noise is then gradually removed, yielding a fine image whose spatial details resemble those of the fine images and whose temporal dynamics follow those of the coarse images. Comprehensive experiments on two public datasets and one self-collected dataset demonstrate that STFDiff outperforms state-of-the-art (SOTA) methods. To further verify the applicability of STFDiff to downstream tasks, we compare the K-means clustering results on the fused images generated by different methods. The classification results of STFDiff are the most consistent with those of the actual images, with an improvement of ∼2% in mean intersection over union (mIoU) over the SOTA methods.
The source code is available at https://github.com/prowDIY/STF.
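
The iterative refinement described in the abstract can be illustrated with a generic conditional reverse-diffusion loop. The sketch below is not the authors' implementation: it assumes standard DDPM ancestral sampling with a linear beta schedule, assumes the coarse images have already been resampled to the fine grid, and replaces the trained DS-Unet with a hypothetical placeholder, `toy_noise_predictor`, that simply returns the noise implied by a naive linear guess of the fine image. All function names and parameters here are illustrative assumptions.

```python
import numpy as np


def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (an assumption; the abstract does not specify one)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def toy_noise_predictor(x_t, alpha_bar_t, coarse_pred, coarse_ref, fine_ref):
    """Hypothetical stand-in for the trained conditional noise predictor.

    It guesses the clean fine image as fine_ref + (coarse_pred - coarse_ref)
    and returns the noise that would turn that guess into x_t. A real model
    would be a learned network conditioned on the same three prior images.
    """
    x0_guess = fine_ref + (coarse_pred - coarse_ref)
    return (x_t - np.sqrt(alpha_bar_t) * x0_guess) / np.sqrt(1.0 - alpha_bar_t)


def sample_fine_image(coarse_pred, coarse_ref, fine_ref, num_steps=50, seed=0):
    """DDPM-style ancestral sampling conditioned on the prior images."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)

    # Start from pure Gaussian noise with the shape of the fine image.
    x = rng.standard_normal(fine_ref.shape)

    for t in range(num_steps - 1, -1, -1):
        # Predict the noise in x under the control of the prior images.
        eps = toy_noise_predictor(x, alpha_bars[t], coarse_pred, coarse_ref, fine_ref)

        # Posterior mean of the previous (less noisy) state.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])

        # Re-inject noise on every step except the last one.
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise

    return x


if __name__ == "__main__":
    demo_rng = np.random.default_rng(1)
    shape = (4, 8, 8)  # (bands, height, width), chosen only for the demo
    fine_ref = demo_rng.random(shape)
    coarse_ref = demo_rng.random(shape)
    coarse_pred = demo_rng.random(shape)
    fused = sample_fine_image(coarse_pred, coarse_ref, fine_ref)
    print(fused.shape)
```

With this placeholder predictor the loop collapses to the linear guess, so it demonstrates only the control flow of noise prediction and gradual denoising, not the fusion quality reported in the paper.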
