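Landmark
Computer science
Artificial intelligence
Pattern recognition (psychology)
Benchmark (surveying)
Deep learning
Feature learning
Transformer
Multiresolution analysis
Computer vision
Wavelet
Wavelet transform
Cartography
Voltage
Discrete wavelet transform
Physics
Quantum mechanics
Geography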
Authors
Thanaporn Viriyasaranon, Serie Ma, Jang-Hwan Choi
Identifier
DOI:10.1007/978-3-031-43987-2_42
Abstract
Accurate localization of anatomical landmarks plays a critical role in clinical diagnosis, treatment planning, and research. Most existing deep learning methods for anatomical landmark localization rely on heatmap regression, which represents each label as a 2D Gaussian distribution centered at the landmark's labeled coordinates and integrates these distributions into a heatmap at a single spatial resolution. However, the accuracy of this approach is limited by the heatmap resolution, which restricts its ability to capture finer details. In this study, we introduce a multiresolution heatmap learning strategy in which heatmaps are generated independently from the feature representations at each resolution, enabling the network to capture semantic feature representations precisely and improving localization accuracy. Moreover, we propose a novel network architecture, the hybrid transformer-CNN (HTC), which combines the strengths of CNN and vision transformer models to extract both local and global representations effectively. Extensive experiments demonstrated that our approach outperforms state-of-the-art deep learning-based anatomical landmark localization networks on the numerical XCAT 2D projection images and two public X-ray landmark detection benchmark datasets. Our code is available at https://github.com/seriee/Multiresolution-HTC.git.
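The heatmap label representation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see their repository for that); the function names `gaussian_heatmap`, `multiresolution_heatmaps`, and `decode`, the strides `(1, 2, 4)`, the value `sigma=2.0`, the image size, and the argmax decoding are all assumptions chosen for illustration. The sketch builds one 2D Gaussian target per resolution and shows how decoding from a coarser heatmap quantizes the recovered coordinate.

```python
import numpy as np

def gaussian_heatmap(height, width, center_xy, sigma):
    """Return a (height, width) array holding a 2D Gaussian peaked at center_xy."""
    ys, xs = np.mgrid[0:height, 0:width]
    cx, cy = center_xy
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))

def multiresolution_heatmaps(image_hw, landmark_xy, strides=(1, 2, 4), sigma=2.0):
    """Build one target heatmap per stride (strides and sigma are illustrative)."""
    h, w = image_hw
    x, y = landmark_xy
    targets = {}
    for s in strides:
        # Scale the landmark into the coarser grid; sigma is expressed in
        # cells of that grid, which is an assumption of this sketch.
        targets[s] = gaussian_heatmap(h // s, w // s, (x / s, y / s), sigma)
    return targets

def decode(heatmap, stride):
    """Argmax decoding, mapped back to full-resolution pixel coordinates."""
    row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return int(col) * stride, int(row) * stride

# Hypothetical 64x64 image with a single landmark at (x=37, y=22).
targets = multiresolution_heatmaps((64, 64), (37, 22))
for s, hm in targets.items():
    print(f"stride={s} shape={hm.shape} decoded={decode(hm, s)}")
```

At stride 1 the decoded point coincides with the labeled coordinate, while at coarser strides it can be off by up to one grid cell, which is the resolution limit the abstract refers to.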