Authors
Shaofeng Wang, Xingjuan Xie, Luo Zhang, Seung-Eun Chang, F F Zuo, Y J Wang, Yuxing Bai
Abstract
Objective: To develop a multi-classification orthodontic image recognition system using the SqueezeNet deep learning model for automatic classification of orthodontic image data. Methods: A total of 35 000 clinical orthodontic images were collected in the Department of Orthodontics, Capital Medical University School of Stomatology, from October to November 2020 and from June to July 2021. The images came from 490 orthodontic patients with a male-to-female ratio of 49∶51 and an age range of 4 to 45 years. After data cleaning based on inclusion and exclusion criteria, the final image dataset included 17 453 face images (frontal, smiling, 90° right, 90° left, 45° right, and 45° left), 8 026 intraoral images [frontal occlusion, right occlusion, left occlusion, upper occlusal view (original and flipped), lower occlusal view (original and flipped), and overbite-overjet view], 4 115 X-ray images [lateral skull X-ray from the left side, lateral skull X-ray from the right side, frontal skull X-ray, cone-beam CT (CBCT), and wrist bone X-ray], and 684 other non-orthodontic images. A labeling team composed of orthodontic doctoral students, associate professors, and professors used image labeling tools to classify the orthodontic images into 20 categories: 6 face image categories, 8 intraoral image categories, 5 X-ray image categories, and other images. The data for each label were randomly divided into training, validation, and testing sets in an 8∶1∶1 ratio using the random function in the Python programming language. The improved SqueezeNet deep learning model was used for training, and 13 000 natural images from the ImageNet open-source dataset were used as additional non-orthodontic images to optimize the algorithm for anomaly data processing. A multi-classification orthodontic image recognition system based on deep learning models was constructed.
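The per-label 8∶1∶1 random split described in the Methods can be illustrated with a minimal sketch. The abstract only states that Python's random function was used, so everything else here is an assumption made for illustration: the function name `split_per_label`, the `(image_id, label)` input format, the fixed seed, and the rule that rounding leftovers go to the test set.

```python
import random
from collections import defaultdict


def split_per_label(samples, ratios=(8, 1, 1), seed=0):
    """Split (image_id, label) pairs into train/val/test sets label by label.

    Hypothetical helper: the abstract does not describe the authors' code.
    """
    rng = random.Random(seed)  # seeded generator so the split is repeatable

    # Group image ids by their classification label.
    by_label = defaultdict(list)
    for image_id, label in samples:
        by_label[label].append(image_id)

    splits = {"train": [], "val": [], "test": []}
    total = sum(ratios)
    for label in sorted(by_label):
        ids = sorted(by_label[label])  # fixed order before shuffling
        rng.shuffle(ids)
        n_train = len(ids) * ratios[0] // total
        n_val = len(ids) * ratios[1] // total
        # Assumption: images left over after integer division go to the test set.
        splits["train"] += [(i, label) for i in ids[:n_train]]
        splits["val"] += [(i, label) for i in ids[n_train:n_train + n_val]]
        splits["test"] += [(i, label) for i in ids[n_train + n_val:]]
    return splits
```

Splitting within each label, rather than over the pooled dataset, keeps every one of the 20 categories represented in all three sets at roughly the same proportion.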
The accuracy of the orthodontic image classification was evaluated using precision, recall, F1 score, and confusion matrix based on the prediction results of the test set. The reliability of the model's image classification judgment logic was verified using the gradient-weighted class activation mapping (Grad-CAM) method to generate heat maps. Results: After data cleaning and labeling, a total of 30 278 orthodontic images were included in the dataset. The test set classification results showed that the precision, recall, and F1 scores of most classification labels were 100%, with only 5 misclassified images out of 3 047, resulting in a system accuracy of 99.84% (3 042/3 047). The precision of anomaly data processing was 100% (10 500/10 500). The heat map showed that the judgment basis of the SqueezeNet deep learning model in the image classification process was basically consistent with that of humans. Conclusions: This study developed a multi-classification orthodontic image recognition system for automatic classification of 20 types of orthodontic images based on the improved SqueezeNet deep learning model.
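The system exhibited good accuracy in orthodontic image classification.
Objective: To develop a deep learning-based multi-classification orthodontic image recognition model for automatic classification of orthodontic image data, and to provide a reference for orthodontic image data management. Methods: A total of 35 000 clinical orthodontic images acquired in the Department of Orthodontics, Capital Medical University School of Stomatology, from October to November 2020 and from June to July 2021 were collected. All images came from 490 patients undergoing orthodontic treatment, with a male-to-female ratio of 49∶51 and an age range of 4 to 45 years. After data cleaning according to the inclusion and exclusion criteria, the final dataset comprised 17 453 facial images (frontal, frontal smiling, right 90°, left 90°, right 45°, and left 45°), 8 026 intraoral images [frontal occlusion, right occlusion, left occlusion, maxillary occlusal view (original), maxillary occlusal view (flipped), mandibular occlusal view (original), mandibular occlusal view (flipped), and overbite-overjet view], 4 115 X-ray images [lateral cephalogram (left), lateral cephalogram (right), posteroanterior cephalogram, panoramic radiograph, and hand-wrist radiograph], and 684 other non-orthodontic images. A labeling team of orthodontic doctoral students, associate chief physicians, and chief physicians used image labeling tools to classify and label the images. The image categories comprised 6 facial, 8 intraoral, and 5 X-ray categories plus other images, for a total of 20 classification labels. The data for each label were randomly divided into training, validation, and test sets in an 8∶1∶1 ratio using the Random function of the Python programming language. An improved SqueezeNet network (a deep learning model) was used for training, and 13 000 images from the open-source ImageNet natural image dataset were used as additional non-orthodontic images to optimize the algorithm for anomaly data processing, yielding a deep learning-based multi-classification orthodontic image recognition model. Based on the test set predictions, precision, recall, F1 score, and the confusion matrix were used as indicators of classification accuracy to evaluate the model's predictive ability. Heat maps generated with gradient-weighted class activation mapping were used to verify the reliability of the model's classification logic. Results: After data cleaning and labeling, 30 278 orthodontic images were included in the dataset. On the test set, precision, recall, and F1 score were 100% for most classification labels; only 5 of 3 047 images were misclassified, giving a model precision of 99.84% (3 042/3 047). The precision of anomaly data processing was 100% (10 500/10 500). The heat maps showed that the basis on which the model classified images was largely consistent with that used by humans. Conclusions: Based on an improved SqueezeNet network, this study built a multi-classification orthodontic image recognition model capable of automatically classifying 20 types of orthodontic images, with good classification accuracy.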
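The evaluation metrics named in the abstract (precision, recall, F1 score, confusion matrix, and overall accuracy) can be sketched from paired true and predicted labels as follows. This is an illustrative sketch using only the standard library, not the authors' evaluation code. The function names and the toy labels in the usage note are assumptions.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Sparse confusion matrix: maps (true_label, predicted_label) to a count."""
    return Counter(zip(y_true, y_pred))


def per_class_metrics(y_true, y_pred):
    """Per-label precision, recall, and F1 derived from the confusion counts."""
    cm = confusion_counts(y_true, y_pred)
    labels = sorted(set(y_true) | set(y_pred))
    metrics = {}
    for c in labels:
        tp = cm[(c, c)]                                  # correctly predicted as c
        fp = sum(cm[(t, c)] for t in labels if t != c)   # predicted c, truly other
        fn = sum(cm[(c, p)] for p in labels if p != c)   # truly c, predicted other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[c] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

For example, calling `per_class_metrics(["face", "face", "xray"], ["face", "xray", "xray"])` returns one entry per label, and `accuracy` on the same lists gives the overall hit rate. The reported overall figure follows the same arithmetic: 3 042 correct out of 3 047 test images is about 99.84%.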