Authors
Elzat Elham-Yilizati Yilihamu, Jun Shang, Zhihai Su, Jintao Yang, Kun Zhao, Hai Zhong, Shiqing Feng
Abstract
Purpose: To apply a deep learning model visualization plugin for rapid and accurate automatic quantification and classification of lumbar disc herniation (LDH) types on axial T2-weighted MRI.

Methods: This retrospective analysis included 2500 patients: a training set of 2120 patients (25,554 images), an internal test set of 80 patients (784 images), and an external test set of 300 patients (3285 images). To facilitate implementation, normal and bulging discs were grouped into a single grade without significant abnormalities, and the region and severity grades of LDH were defined by the relationship between the disc and the spinal canal. The automated detection training and validation process used the YOLOv8 object detection model for target area localization, the YOLOv8-seg segmentation model for disc recognition, and the YOLOv8-pose keypoint detection model for positioning. The stability of the detection results was verified using Intersection over Union (IoU), mean error (ME), precision (P), F1 score (F1), the Kappa coefficient (kappa), and the 95% confidence interval (95%CI).

Results: On the training set, the segmentation model achieved an mAP50:95 of 98.12% and an IoU of 98.36%, while the keypoint detection model achieved an mAP50:95 of 93.58% with an ME of 0.208 mm. On the internal and external test sets, the segmentation model's IoU was 97.58% and 97.49%, respectively, and the keypoint model's ME was 0.219 mm and 0.221 mm, respectively. Quantification of the extent of LDH was validated using P, F1, and kappa. For LDH classification (18 categories), the internal and external test sets showed P = 81.21% and 74.50%, F1 = 81.26% and 74.42%, and kappa = 0.75 (95%CI 0.68, 0.82, p = 0.00) and 0.69 (95%CI 0.65, 0.73, p = 0.00), respectively. For the severity grades of LDH (four categories), the internal and external test sets showed P = 92.51% and 90.07%, F1 = 92.36% and 89.66%, and kappa = 0.88 (95%CI 0.80, 0.96, p = 0.00) and 0.85 (95%CI 0.81, 0.89, p = 0.00), respectively. For the regions of LDH (eight categories), the internal and external test sets showed P = 83.34% and 77.87%, F1 = 83.85% and 78.21%, and kappa = 0.77 (95%CI 0.70, 0.85, p = 0.00) and 0.71 (95%CI 0.67, 0.75, p = 0.00), respectively.

Conclusion: The automated aided diagnostic model achieved high performance in detecting and classifying LDH and demonstrated substantial consistency with expert classification.
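For readers unfamiliar with two of the reported metrics, the following is a minimal sketch of how IoU for a segmentation mask and Cohen's kappa for model-versus-expert agreement could be computed. It is not the authors' implementation: the function names `mask_iou` and `cohen_kappa`, the toy masks, and the grade labels are hypothetical, and the use of unweighted kappa is an assumption, since the abstract does not say whether a weighted variant was used.

```python
from collections import Counter


def mask_iou(pred, truth):
    """IoU of two same-shaped binary masks given as nested lists of 0/1."""
    inter = 0
    union = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Two empty masks are treated as a perfect match (a convention, assumed here).
    return inter / union if union else 1.0


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa between two raters' categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of cases where both raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy data (hypothetical): a predicted and a reference disc mask,
# and severity grades assigned by the model and by an expert.
pred_mask = [[1, 1, 0], [0, 1, 0]]
true_mask = [[1, 0, 0], [0, 1, 1]]
model_grades = ["mild", "mild", "severe", "severe"]
expert_grades = ["mild", "severe", "severe", "severe"]

iou_value = mask_iou(pred_mask, true_mask)
kappa_value = cohen_kappa(model_grades, expert_grades)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why the kappa values reported for the 18-category task sit below the corresponding precision figures.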