Radiomics and machine learning model can improve the differentiation between ocular adnexal lymphoma and idiopathic orbital inflammation

Radiomics; Lymphoma; Inflammation; Medicine; Pathology; Computer science; Radiology; Immunology
Authors
Guorong Wang,Xiaoxia Qu,Jian Guo,Yongheng Luo,Junfang Xian
Source
Journal: Chinese Medical Journal [Ovid Technologies (Wolters Kluwer)]
Identifier
DOI:10.1097/cm9.0000000000003356
Abstract

To the Editor: Distinguishing ocular adnexal lymphoma (OAL) from idiopathic orbital inflammation (IOI) is challenging owing to their similar clinical symptoms and imaging features. Previous research has demonstrated that magnetic resonance imaging (MRI)-based radiological characteristics can offer valuable insights for distinguishing between OAL and IOI. However, the diagnostic accuracy of these imaging findings relies largely on subjective interpretation, leading to inconsistent and sometimes controversial conclusions. The integration of MRI-based radiomics with machine learning (ML) is expected to provide quantitative features in a more objective manner, supporting the construction of diagnostic models and improving diagnostic accuracy.
OAL accounts for 10–50% of orbital malignancies in adults, with low-dose radiotherapy as the recommended initial treatment.[1,2] IOI is an inflammatory process in the orbit with uncertain causes that responds well to oral corticosteroids. Because the two conditions are treated differently, differentiating OAL from IOI is clinically essential, yet their similar symptoms and imaging characteristics make this difficult. Biopsy represents the gold standard, but it is invasive and carries risk. MRI offers a non-invasive alternative, and recent studies highlight its potential in distinguishing OAL from IOI using radiomics.[3] However, these studies were single-center and used a single algorithm. This study aimed to develop multiparametric MRI radiomics models using T1- and T2-weighted imaging (T1WI, T2WI) and T1-weighted contrast-enhanced (T1CE) images combined with various ML algorithms to distinguish between these two entities. We also sought to identify the optimal model and test its clinical applicability with an external test set.
This retrospective study was approved by the Institutional Review Board of Beijing Tongren Hospital (No. TREC2023-KY107) and registered on ClinicalTrials.gov (NCT06336499). The requirement for informed consent was waived owing to the retrospective nature of the study.
We collected data from patients diagnosed with OAL or IOI between January 2015 and March 2022 at Beijing Tongren Hospital. Inclusion criteria: (1) patients with pathologically confirmed OAL or IOI; (2) those with complete preoperative MRI data (T1WI, T2WI, and T1CE); and (3) those with clearly visible lesions on MRI. Exclusion criteria: (1) patients whose images had severe artifacts; and (2) those with lesions smaller than 1 cm. A total of 132 OAL and 106 IOI patients from Beijing Tongren Hospital were enrolled and randomly divided into training and internal test sets (7:3 ratio). Additionally, 31 OAL and 14 IOI patients from the Second Xiangya Hospital of Central South University during the same period were included in the external test set. The details of MRI acquisition are shown in Supplementary Table 1, https://links.lww.com/CM9/C203.
The regions of interest (ROIs) for OAL and IOI were manually delineated on T1WI, T2WI, and T1CE images using ITK-SNAP (version 4.0.0, developed by Penn Image Computing and Science Laboratory at the University of Pennsylvania, Philadelphia, USA, http://www.itksnap.org/) by a radiologist with 3 years of experience (Radiologist 1). These segmentations were then reviewed and adjusted by a senior radiologist with 10 years of experience (Radiologist 2). To assess intra-observer consistency, Radiologist 1 re-segmented images from 30 randomly selected patients. Visual assessments were conducted independently by two radiologists who were blinded to the pathological findings.
All MRI images underwent gray-level normalization (ranging from 0 to 1024) before feature extraction. Radiomics features were extracted using the FeAture Explorer software (FAE; version 0.5.8, developed by East China Normal University and Siemens Healthineers Ltd., Shanghai, China; https://github.com/salan668/FAE) configured with Pyradiomics. A total of 1688 features were extracted from each original MRI sequence image [Supplementary Table 2, https://links.lww.com/CM9/C203].
To balance the OAL and IOI sample numbers, we used the Synthetic Minority Oversampling Technique (SMOTE) to preprocess features from each MRI sequence. We searched for the best ML model for classifying OAL and IOI by combining multiple normalization methods, feature dimension reduction and selection approaches, and classification methods. Features were normalized using Z-score, Min-Max, or Mean normalization. Feature dimensions were reduced with either the Pearson correlation coefficient (PCC), in which case features with PCC >0.99 were removed, or principal component analysis (PCA). Feature selection used analysis of variance (ANOVA), Relief, recursive feature elimination (RFE), or Kruskal–Wallis (KW), with each method retaining between 1 and 10 features. Ten ML algorithms were used for classification: logistic regression (LR), support vector machine (SVM), random forests (RF), logistic regression via Lasso (LRLasso), linear discriminant analysis (LDA), AdaBoost (AB), autoencoder (AE), naive Bayes (NB), Gaussian process (GP), and decision tree (DT). This resulted in 2400 pipelines, calculated as follows: 3 (normalization methods) × 2 (dimension reduction methods) × 4 (feature selection methods) × 10 (feature numbers) × 10 (classification methods) = 2400. Radiomics features from each MRI sequence were used to build models to distinguish OAL from IOI. We then combined features from T1WI, T2WI, and T1CE images to train a further model and identified the best-performing one. The workflow framework is illustrated in Figure 1.
Figure 1: The schematic diagram for the multiparametric MRI-based machine learning model construction for differential diagnosis between OAL and IOI.
AB: AdaBoost; AE: Autoencoder; ANOVA: Analysis of variance; AUC: Area under the receiver operating characteristic curve; DT: Decision tree; GLCM: Gray level co-occurrence matrix; GLDM: Gray level dependence matrix; GLRLM: Gray level run length matrix; GLSZM: Gray level size zone matrix; GP: Gaussian process; ICC: Intraclass correlation coefficient; IOI: Idiopathic orbital inflammation; KW: Kruskal–Wallis; LBP: Local binary pattern; LDA: Linear discriminant analysis; LR: Logistic regression; LRLasso: Logistic regression via Lasso; MRI: Magnetic resonance imaging; NB: Naive Bayes; NGTDM: Neighboring gray tone difference matrix; OAL: Ocular adnexal lymphoma; PCA: Principal component analysis; PCC: Pearson correlation coefficient; RF: Random forests; RFE: Recursive feature elimination; ROIs: Regions of interest; SVM: Support vector machine; T1CE: T1-weighted contrast-enhanced; T1WI: T1-weighted imaging; T2WI: T2-weighted imaging.
The t-test and chi-squared test were used to compare continuous and categorical variables, respectively. Intra-observer consistency was evaluated using the intraclass correlation coefficient (ICC). The chi-squared test was used to compare diagnostic performance between visual assessment and the ML models. Five-fold cross-validation was applied to the training set. Model performance was assessed using receiver operating characteristic (ROC) curve analysis, quantified by the area under the ROC curve (AUC). The DeLong test was used to compare ROC curves across models. Accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated at the Youden index cutoff. The 95% confidence interval (CI) was estimated via bootstrapping with 1000 replicates. Calibration was measured by the Brier score, which ranges from 0 to 1.
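The cutoff-based metrics named above can be illustrated with a minimal sketch. It assumes binary labels (1 for one class, 0 for the other), a score where higher values favor class 1, and the rule that a case is called positive when its score is at or above the cutoff. The function names and the toy data are hypothetical and are not taken from FAE or from the study.

```python
from typing import Dict, List, Tuple


def confusion_counts(y_true: List[int], scores: List[float],
                     cutoff: float) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN), calling a case positive when score >= cutoff."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= cutoff
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 0 and predicted_positive:
            fp += 1
        elif y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def safe_div(num: float, den: float) -> float:
    """Divide, returning NaN when the denominator is zero."""
    return num / den if den else float("nan")


def youden_cutoff(y_true: List[int], scores: List[float]) -> Tuple[float, float]:
    """Try every observed score as a cutoff; keep the first one maximizing
    J = sensitivity + specificity - 1. Assumes both classes are present."""
    best_cutoff, best_j = scores[0], float("-inf")
    for c in sorted(set(scores)):
        tp, fp, tn, fn = confusion_counts(y_true, scores, c)
        j = safe_div(tp, tp + fn) + safe_div(tn, tn + fp) - 1.0
        if j > best_j:
            best_cutoff, best_j = c, j
    return best_cutoff, best_j


def metrics_at(y_true: List[int], scores: List[float],
               cutoff: float) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, PPV, and NPV at a fixed cutoff."""
    tp, fp, tn, fn = confusion_counts(y_true, scores, cutoff)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "sensitivity": safe_div(tp, tp + fn),
        "specificity": safe_div(tn, tn + fp),
        "ppv": safe_div(tp, tp + fp),
        "npv": safe_div(tn, tn + fn),
    }


# Hypothetical toy data, for illustration only.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
model_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
cutoff, j_stat = youden_cutoff(labels, model_scores)
metrics = metrics_at(labels, model_scores, cutoff)
```

When two cutoffs tie on J, this sketch keeps the lower one; other tie-breaking rules are equally defensible, and the source does not say which one was used.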
Analyses were conducted using FAE in Python (version 3.7.6, Python Software Foundation, 9450 SW Gemini Dr., ECM# 90772, Beaverton, OR 97008, USA) and Statistical Product and Service Solutions (SPSS, version 20.0, SPSS Inc., Chicago, USA). A P-value less than 0.05 was considered statistically significant.
OAL was more common in older male patients in both the training and internal test sets (all P <0.05). In the external test set, OAL was also more common in older patients (P = 0.020), but the gender distribution was not significantly different (P = 0.072). There were no significant differences in lesion side distribution in any of the sets (all P >0.05) [Supplementary Table 3, https://links.lww.com/CM9/C203]. The ICC values ranged from 0.815 to 0.915 (P <0.001), indicating satisfactory repeatability of feature extraction.
For the differential diagnosis model combining T1WI, T2WI, and T1CE, the pipeline with Mean normalization, PCA, ANOVA, and LR achieved the highest AUC. Ten features were retained in this model according to the "one-standard error" rule. The AUCs were 0.921 (95% CI: 0.876–0.966), 0.900 (95% CI: 0.851–0.948), 0.849 (95% CI: 0.759–0.940), and 0.786 (95% CI: 0.653–0.918) in the training, validation, internal test, and external test sets, respectively [Supplementary Table 4, Supplementary Figure 1, https://links.lww.com/CM9/C203]. These AUCs were higher than those of the models built on separate MRI sequences, although the differences in AUC among the four models in the internal and external test sets were not statistically significant (DeLong test, all P >0.05). The model combining T1WI, T2WI, and T1CE nevertheless had the lowest Brier scores, 0.155 (internal test set) and 0.190 (external test set), indicating good calibration. The ML model based on multi-sequence MRI outperformed a junior radiologist and matched the performance of a senior radiologist [Supplementary Table 5, https://links.lww.com/CM9/C203].
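For readers less familiar with the Brier score reported above, the following minimal sketch shows how it is computed for a binary outcome: it is the mean squared difference between the predicted probability and the observed 0/1 label, so lower values indicate better calibration. The function name and the toy values are hypothetical and do not reproduce the study's data.

```python
from typing import List


def brier_score(y_true: List[int], probs: List[float]) -> float:
    """Mean squared difference between predicted probability and 0/1 outcome."""
    if len(y_true) != len(probs) or not y_true:
        raise ValueError("y_true and probs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)


# Hypothetical toy data, for illustration only.
observed = [1, 0, 1, 0]
predicted = [0.9, 0.2, 0.6, 0.4]
score = brier_score(observed, predicted)

# An uninformative model that always predicts 0.5 scores exactly 0.25,
# which is a useful reference point when reading reported values.
baseline = brier_score(observed, [0.5] * len(observed))
```

Against that 0.25 reference point, the reported values of 0.155 and 0.190 sit below the score of an uninformative constant prediction.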
Several prior studies have suggested that MRI radiomics can differentiate OAL from IOI [Supplementary Table 6, https://links.lww.com/CM9/C203]. These studies were carried out at a single institution, with a relatively small sample size and a single algorithm, so their diagnostic performance leaves room for improvement. The present study differed from them by employing a range of methods and algorithms to create 2400 processing pipelines for multiparametric MRI data. We found that the optimal pipeline, configured with Mean normalization, PCA, ANOVA, and LR and based on the combination of T1WI, T2WI, and T1CE images, achieved the highest AUCs of 0.849 and 0.786 in the internal and external test cohorts, respectively, surpassing the previous findings.
We assessed our study using the Radiomics Quality Score (RQS),[4] achieving a score of 15. This is higher than the average RQS of 11.17 reported in a recent systematic review of ophthalmic radiomics studies.[5] The review highlighted limitations such as small sample sizes (median of 110 participants) and few studies with prospective designs or multicenter validation. Our study addresses these by including a relatively larger cohort (133 OALs and 106 IOIs) and an external validation set (31 OALs and 14 IOIs). The review also noted a lack of open data or code in many studies. In contrast, our study used the open-source tool FAE for radiomics analysis, making it more accessible to researchers pursuing similar work.
We acknowledge that there is a risk of overfitting due to the large number of features and relatively small sample sizes. To address this issue, we used feature selection methods such as ANOVA and feature dimension reduction methods such as PCC and PCA. In addition, we applied five-fold cross-validation to ensure the robustness of model evaluation.
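The figure of 2400 pipelines cited above is the product of the option counts given in the methods (3 × 2 × 4 × 10 × 10). A minimal sketch of that enumeration follows. The string labels simply mirror the abbreviations used in the text; they are not FAE identifiers, and the sketch only counts configurations without training any model.

```python
from itertools import product

# Option lists taken from the methods description; labels are plain strings.
normalizers = ["Z-score", "Min-Max", "Mean"]
reducers = ["PCC", "PCA"]
selectors = ["ANOVA", "Relief", "RFE", "KW"]
feature_counts = list(range(1, 11))  # 1 to 10 selected features
classifiers = ["LR", "SVM", "RF", "LRLasso", "LDA",
               "AB", "AE", "NB", "GP", "DT"]

# Each pipeline is one choice from every list (a full Cartesian product).
pipelines = list(product(normalizers, reducers, selectors,
                         feature_counts, classifiers))

# The configuration reported as optimal for the combined-sequence model.
reported_best = ("Mean", "PCA", "ANOVA", 10, "LR")
```

Because every pipeline is evaluated on the same limited sample, a grid this size is one reason the text stresses cross-validation and a separate external test set.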
Finally, the optimal model achieved satisfactory AUC values of 0.849 and 0.786 in the internal and external test sets, respectively, indicating promising diagnostic performance. The superior performance of the radiomics model combining T1WI, T2WI, and T1CE can be attributed to several factors: (1) The integration of multiple MRI sequences captures a comprehensive set of imaging features, enhancing the model's ability to distinguish between OAL and IOI. (2) Techniques such as PCA for dimension reduction, ANOVA for feature selection, and LR for classification ensure that the most relevant and discriminative features are used. (3) High AUCs of 0.921, 0.900, 0.849, and 0.786 in the training, validation, internal test, and external test sets indicate the model's robustness and generalizability across different datasets. Clinically, this model advances non-invasive differentiation of OAL from IOI, potentially reducing the need for biopsies and improving treatment decisions.
This study has several limitations. First, because OAL comprises various histologic subtypes, future studies with larger sample sizes should conduct detailed subgroup evaluations. Second, the model relied on manual segmentation of orbital lesions, which is labor-intensive and time-consuming. Automated segmentation should be considered in future work. Third, although diffusion-weighted imaging (DWI), apparent diffusion coefficient (ADC), and dynamic contrast-enhanced (DCE) MRI can provide valuable insights for distinguishing between OAL and IOI, they were not included in this study. Finally, the retrospective design may introduce selection bias.
In conclusion, this study developed an ML model using radiomics from T1WI, T2WI, and T1CE MRI data to distinguish OAL from IOI. The optimal pipeline included Mean normalization, PCA for dimension reduction, ANOVA for feature selection, and LR for classification.
This method shows great promise as a valuable tool for differential diagnosis between OAL and IOI, especially for radiology residents with limited head and neck imaging experience.
Funding
The study was supported by the National Health Commission's Capacity Building and Continuing Education Center (No. YXFSC2022JJSJ009), the Beijing Municipal Administration of Hospitals' Ascent Plan (No. DFL20190203), the Beijing Postdoctoral Research Foundation (No. 2023-ZZ-027), and the National Key R&D Program of China (No. 2022YFC2404005).
Conflicts of interest
None.