Computer science
Diffusion
Image (mathematics)
Artificial intelligence
Segmentation
Image enhancement
Image segmentation
Computer vision
Physics
Thermodynamics
Authors
Z.-W. Dong,Genji Yuan,Zhen Hua,Jinjiang Li
Identifier
DOI:10.1016/j.eswa.2024.123549
Abstract
In recent years, denoising diffusion models have achieved remarkable success in generating pixel-level representations with semantic value for image generation modeling. In this study, we propose a novel end-to-end framework, called TGEDiff, for medical image segmentation. TGEDiff fuses a textual attention mechanism with the diffusion model by introducing an auxiliary classification task, guiding the diffusion model with textual information to generate high-quality pixel-level representations. To overcome the limited perceptual field of the independent feature encoders within the diffusion model, we introduce a multi-kernel excitation module that extends the model's perceptual capability. Meanwhile, a guided feature enhancement module is introduced into the Denoising-UNet to focus the model's attention on important regions and attenuate the influence of noise and irrelevant background in medical images. We evaluated TGEDiff on three datasets (Kvasir-SEG, Kvasir-Sessile, and GLaS), where it achieved significant improvements over state-of-the-art approaches, with the F1 score and mIoU improving by 0.88% and 1.09% on Kvasir-SEG, 3.21% and 3.43% on Kvasir-Sessile, and 1.29% and 2.34% on GLaS, respectively. These results validate the excellent performance of TGEDiff in medical image segmentation. TGEDiff is expected to facilitate accurate diagnosis and treatment of medical diseases through more precise structural segmentation.
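The abstract's multi-kernel excitation module is described only at a high level: parallel branches with different kernel sizes widen the receptive field of a feature encoder before the branches are fused. A minimal stdlib-only sketch of that idea on a 1-D feature sequence (the names `conv1d` and `multi_kernel_excitation` and the averaging fusion are illustrative assumptions, not the paper's actual implementation):

```python
# Hypothetical sketch: parallel convolutions with different kernel sizes
# over the same feature sequence, fused to enlarge the receptive field.
# This is NOT the paper's code; it only illustrates the multi-kernel idea.

def conv1d(signal, kernel):
    """1-D convolution with 'same' zero padding (odd kernel length)."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [sum(padded[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal))]

def multi_kernel_excitation(signal, kernels):
    """Run parallel branches, then fuse by element-wise averaging."""
    branches = [conv1d(signal, k) for k in kernels]
    n = len(kernels)
    return [sum(vals) / n for vals in zip(*branches)]

signal = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]
kernels = [
    [1 / 3] * 3,  # small receptive field
    [1 / 5] * 5,  # medium receptive field
    [1 / 7] * 7,  # large receptive field
]
fused = multi_kernel_excitation(signal, kernels)
print([round(v, 3) for v in fused])
```

In a real 2-D segmentation network the branches would be convolution layers with kernel sizes such as 3x3, 5x5, and 7x7, and the fusion could be learned (e.g. a weighted sum) rather than a plain average.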