Authors
Ri Qi Su, Jian Song, Zheng Wang, Shuang Mao, Yun Mao, Xuewen Wu, Muzhou Hou
Abstract
Objective: Chronic suppurative otitis media (CSOM) and middle ear cholesteatoma (MEC) are the two most common chronic middle ear diseases. Because their clinical manifestations are similar, the two diseases are prone to misdiagnosis and missed diagnosis during diagnosis and treatment. High-resolution computed tomography (HRCT) clearly displays the fine anatomical structure of the temporal bone and accurately reflects middle ear lesions and their extent, which gives it an advantage in the differential diagnosis of chronic middle ear diseases. This study aims to develop a deep learning model that automatically extracts information from temporal bone HRCT image data and classifies chronic middle ear diseases, in order to improve the efficiency of classification and diagnosis in clinical practice and to reduce missed diagnosis and misdiagnosis.
Methods: The clinical records and temporal bone HRCT imaging data of patients with chronic middle ear diseases hospitalized in the Department of Otorhinolaryngology, Xiangya Hospital from January 2018 to October 2020 were collected retrospectively. Two experienced otorhinolaryngologists independently reviewed the patients' medical records and reached a consensus on the final diagnosis. A total of 499 patients (998 ears) were enrolled. The 998 ears were divided into 3 groups: an MEC group (108 ears), a CSOM group (622 ears), and a normal group (268 ears). Gaussian noise with different variances was used to augment the dataset and offset the imbalance in sample numbers between groups. The augmented experimental dataset contained 1,806 ears. Of these, 75% (1,355 ears) were randomly selected for training, 10% (180 ears) for validation, and the remaining 15% (271 ears) for testing and evaluating model performance.
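The augmentation and random split described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the noise levels, the per-class target count, and the rounding rule are all assumptions. The rounding (ceiling for training, floor for validation, remainder for testing) was chosen only because it reproduces the reported 1,355 / 180 / 271 counts.

```python
import math
import random


def add_gaussian_noise(volume, sigma, rng):
    """Return a copy of `volume` (nested lists: slices x rows x values)
    with zero-mean Gaussian noise of standard deviation `sigma` added."""
    return [[[v + rng.gauss(0.0, sigma) for v in row] for row in sl]
            for sl in volume]


def augment_class(samples, target_count, sigmas, rng):
    """Grow one class to `target_count` samples by appending noisy copies.
    Each full pass over `samples` uses the next value in `sigmas`
    (hypothetical noise levels, not taken from the study)."""
    out = list(samples)
    i = 0
    while len(out) < target_count:
        source = samples[i % len(samples)]
        sigma = sigmas[(i // len(samples)) % len(sigmas)]
        out.append(add_gaussian_noise(source, sigma, rng))
        i += 1
    return out


def split_counts(n, train_frac=0.75, val_frac=0.10):
    """Assumed rounding: ceil for training, floor for validation,
    and the remainder for testing."""
    n_train = math.ceil(n * train_frac)
    n_val = math.floor(n * val_frac)
    return n_train, n_val, n - n_train - n_val


def random_split(items, rng, train_frac=0.75, val_frac=0.10):
    """Shuffle `items` and cut them into train / validation / test lists."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_train, n_val, _ = split_counts(len(shuffled), train_frac, val_frac)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])
```

For example, `split_counts(1806)` gives the three subset sizes, and `augment_class(mec_volumes, target, [0.01, 0.05], random.Random(0))` would enlarge a minority class, where `mec_volumes` and `target` are placeholders for data and a class size that the abstract does not specify.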
The model was designed as a serial structure comprising three deep learning models with different functions. The first was a region proposal network algorithm, which located the middle ear region within the whole HRCT image and then cropped and saved it. The second was an image-comparison convolutional neural network (CNN) based on a Siamese (twin) network structure, which searched the cropped images for those matching the key slices of the HRCT series and constructed 3D data blocks from them. The third was based on 3D-CNN operations; it performed the final classification and diagnosis on the 3D data blocks and output the final prediction probability.
Results: The key-slice search network based on the Siamese network structure achieved an average AUC of 0.939 across 10 key slices. The 3D-CNN classification network achieved an overall accuracy of 96.5% and an overall recall of 96.4%, with an average AUC of 0.983 across the 3 classes. The recall for CSOM and MEC cases was 93.7% and 97.4%, respectively. In subsequent comparison experiments, several classical CNNs reached an average accuracy of 79.3% and an average recall of 87.6%. The accuracy and recall of the deep learning network constructed in this study were therefore about 17.2 and 8.8 percentage points higher, respectively, than those of the common CNNs.
Conclusion: The deep learning network model proposed in this study can automatically extract 3D data blocks containing middle ear features from patients' temporal bone HRCT image data, which reduces the overall size of the data while preserving the relationships between corresponding images, and can then use a 3D-CNN to classify and diagnose CSOM and MEC.
The design of the model fits the continuous nature of HRCT data well, and the experimental results show high accuracy and broad adaptability, outperforming the CNN methods in common use today.
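As a rough illustration of the second stage of the serial design (matching cropped images to key slices and assembling a 3D data block), the sketch below picks, for each key-slice template, the most similar cropped slice and stacks the selections in order. It is only a sketch under stated assumptions: `embed` is a placeholder that flattens and normalises pixel values, standing in for a trained Siamese branch, and the function names and the use of cosine similarity are hypothetical rather than taken from the study.

```python
import math


def embed(image):
    """Placeholder embedding: flatten the image and L2-normalise it.
    A trained Siamese branch would take the place of this function."""
    flat = [float(v) for row in image for v in row]
    norm = math.sqrt(sum(v * v for v in flat)) or 1.0
    return [v / norm for v in flat]


def cosine(a, b):
    """Cosine similarity of two vectors that are already L2-normalised."""
    return sum(x * y for x, y in zip(a, b))


def select_key_slices(cropped_slices, key_templates):
    """For each key-slice template, return the index of the cropped
    slice whose embedding is most similar to the template's embedding."""
    slice_vecs = [embed(s) for s in cropped_slices]
    chosen = []
    for template in key_templates:
        t = embed(template)
        scores = [cosine(t, v) for v in slice_vecs]
        chosen.append(max(range(len(scores)), key=scores.__getitem__))
    return chosen


def build_block(cropped_slices, indices):
    """Stack the selected slices, in template order, into a
    depth x height x width nested list (the 3D data block)."""
    return [cropped_slices[i] for i in indices]
```

With 10 templates, `build_block(slices, select_key_slices(slices, templates))` yields a block of depth 10 that a 3D-CNN classifier could take as input. The block is much smaller than the full series while the selected slices stay in a fixed anatomical order.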