Computer science
Encoder
Segmentation
Transformer
Artificial intelligence
Decoding methods
Feature extraction
Pattern recognition (psychology)
Algorithm
Quantum mechanics
Operating system
Physics
Voltage
Authors
Cheng Wang,Le Wang,Nuoqi Wang,Xiaoling Wei,Ting Feng,Minfeng Wu,Qi Yao,Rongjun Zhang
Identifier
DOI:10.1016/j.compbiomed.2023.107803
Abstract
Medical image segmentation currently faces two challenges: effectively extracting and fusing long-distance and local semantic information, and mitigating or eliminating semantic gaps during the encoding and decoding process. To alleviate these two problems, we propose a new U-shaped network structure, called CFATransUnet, with Transformer and CNN blocks as the backbone network, equipped with a Channel-wise Cross Fusion Attention and Transformer (CCFAT) module, which contains a Channel-wise Cross Fusion Transformer (CCFT) and a Channel-wise Cross Fusion Attention (CCFA). Specifically, we use Transformer and CNN blocks to construct the encoder and decoder for adequate extraction and fusion of long-range and local semantic features. The CCFT module utilizes the self-attention mechanism to reintegrate semantic information from different stages into cross-level global features, reducing the semantic asymmetry between features at different levels. The CCFA module adaptively learns the importance of each feature channel from a global perspective, strengthening the capture of useful information and suppressing unimportant features to mitigate semantic gaps. The combination of CCFT and CCFA guides the fusion of features at different levels more effectively from a global perspective. The consistent architecture of the encoder and decoder also alleviates the semantic gap. Experimental results suggest that the proposed CFATransUnet achieves state-of-the-art performance on four datasets. The code is available at https://github.com/CPU0808066/CFATransUnet.
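To make the channel-wise attention idea concrete, below is a minimal, hypothetical PyTorch sketch of a channel-reweighting gate in the spirit of CCFA as described in the abstract: per-channel importance is estimated from globally pooled statistics and used to rescale the feature map before fusion. The module name, arguments, and reduction ratio are illustrative assumptions, not the authors' implementation; see the linked repository for the official code.

```python
# Hypothetical sketch of channel-wise attention over a feature map,
# assuming a squeeze-and-excitation-style gate; not the official CCFA code.
import torch
import torch.nn as nn


class ChannelWiseAttention(nn.Module):
    """Reweights feature channels using globally pooled statistics."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)           # global spatial average -> (B, C, 1, 1)
        self.mlp = nn.Sequential(                     # learn per-channel importance weights
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        weights = self.mlp(self.pool(x).view(b, c))   # (B, C) weights in [0, 1]
        return x * weights.view(b, c, 1, 1)           # emphasize useful channels, suppress others


if __name__ == "__main__":
    skip = torch.randn(2, 64, 56, 56)                 # e.g. an encoder skip-connection feature
    gate = ChannelWiseAttention(channels=64)
    print(gate(skip).shape)                           # torch.Size([2, 64, 56, 56])
```

In the paper's pipeline such a gate would be applied to multi-stage features after the cross-level transformer fusion (CCFT), so that the decoder receives skip features whose channels have already been reweighted from a global view.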