Authors
Jingdong Yang, Jun Tu, Xiaolin Zhang, Shaoqing Yu, Zheng Xianyou
Abstract
In Brief: An efficient image segmentation model, TSE DeepLab, is proposed for clinical image segmentation of sinusitis and patellar fracture cases. TSE DeepLab replaces global average pooling with a TSE block, which combines a visual Transformer operating on static visual tokens with an SE block, in order to improve global feature extraction.
• A Transformer operating on static visual tokens is applied to clinical images to improve global feature extraction.
• The hyperparameters of the TSE block are optimized to speed up convergence and improve segmentation performance.
Medical image segmentation is a key research topic in precision medicine. Existing models often ignore important pixel features and fail to extract global correlation features effectively, which leads to poor segmentation performance. In this paper, we propose TSE DeepLab, which builds on the DeepLabv3 framework. It retains the original atrous convolution for local feature extraction, converts the feature maps produced by the backbone into visual tokens, and feeds these tokens into a Transformer module to strengthen global feature extraction. Squeeze-and-excitation (SE) components are added after the Transformer module to weight the channels by importance, so that the model attends to the important pixel features of each channel. We apply 5-fold cross-validation to clinical sinus cases from Shanghai Tongji Hospital, affiliated with Tongji University, and to patellar fracture cases from the Sixth People's Hospital, affiliated with Shanghai Jiao Tong University. Averaged over the evaluation, the model achieves an Accuracy of 99.74%, Precision of 93.67%, IoU of 88.10%, Specificity of 99.87%, F1-score of 93.63%, and Sensitivity of 93.82% on the sinus dataset, and an Accuracy of 99.53%, Precision of 85.64%, IoU of 78.47%, Specificity of 99.72%, F1-score of 87.15%, and Sensitivity of 89.95% on the patellar fracture dataset.
Compared with several typical segmentation models, the proposed model attains higher segmentation accuracy and better generalization, making it a more useful reference for clinical diagnosis.
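The six evaluation measures reported above are not defined in this section. As a minimal sketch, assuming the standard per-pixel binary definitions built from true/false positive and negative counts, they could be computed as follows. The function name `segmentation_metrics` and the flat 0/1 mask representation are illustrative assumptions, not the authors' code, and how the paper averages the measures across folds or images is not specified here.

```python
def segmentation_metrics(pred, target):
    """Compute standard binary segmentation measures.

    pred, target: equal-length flat sequences of 0/1 pixel labels
    (hypothetical representation; 1 = foreground, 0 = background).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    # Confusion-matrix counts over all pixels.
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)

    def ratio(num, den):
        # Return 0.0 for an empty denominator instead of raising.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),   # also called recall
        "specificity": ratio(tn, tn + fp),
        "iou": ratio(tp, tp + fp + fn),      # intersection over union
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Toy example: 8 pixels, 2 true positives, 1 false positive,
# 1 false negative, 4 true negatives.
pred = [1, 1, 0, 0, 1, 0, 0, 0]
target = [1, 0, 0, 0, 1, 1, 0, 0]
metrics = segmentation_metrics(pred, target)
```

Under these definitions, Accuracy and Specificity are dominated by background pixels when the foreground region is small, which is consistent with both being above 99% on the two datasets while IoU and F1-score are noticeably lower.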