Radiomics and machine learning model can improve the differentiation between ocular adnexal lymphoma and idiopathic orbital inflammation

Radiomics, Lymphoma, Inflammation, Medicine, Pathology, Computer science, Radiology, Immunology
Authors
Guorong Wang, Xiaoxia Qu, Jian Guo, Yongheng Luo, Junfang Xian
Source
Journal: Chinese Medical Journal [Ovid Technologies (Wolters Kluwer)]
Identifier
DOI: 10.1097/cm9.0000000000003356
Abstract

To the Editor: Distinguishing ocular adnexal lymphoma (OAL) from idiopathic orbital inflammation (IOI) is challenging owing to their similar clinical symptoms and imaging features. Previous research has demonstrated that magnetic resonance imaging (MRI)-based radiological characteristics can offer valuable insights for distinguishing between OAL and IOI. However, the diagnostic accuracy of these imaging findings relies largely on subjective interpretation, leading to inconsistent and sometimes controversial conclusions. The integration of MRI-based radiomics with machine learning (ML) is expected to provide quantitative features in a more objective manner, supporting the construction of diagnostic models and enhancing diagnostic accuracy.
OAL accounts for 10–50% of orbital malignancies in adults, with low-dose radiotherapy as the recommended initial treatment.[1,2] IOI is an inflammatory process in the orbit of uncertain cause that responds well to oral corticosteroids. Because the two conditions are treated differently, differentiating OAL from IOI is clinically essential, yet it is difficult owing to their similar symptoms and imaging characteristics. Biopsy is the gold standard, but it is invasive and carries risk. MRI offers a non-invasive alternative, and recent studies highlight its potential for distinguishing OAL from IOI using radiomics.[3] However, these studies were single-center and each used a single algorithm.
This study aimed to develop multiparametric MRI radiomics models using T1- and T2-weighted imaging (T1WI, T2WI) and T1-weighted contrast-enhanced (T1CE) images combined with various ML algorithms to distinguish between these two entities. We also sought to identify the optimal model and test its clinical applicability with an external test set.
This retrospective study was approved by the Institutional Review Board of Beijing Tongren Hospital (No. TREC2023-KY107) and registered on ClinicalTrials.gov (NCT06336499). The requirement for informed consent was waived because of the retrospective nature of the study.
We included patients diagnosed with OAL or IOI between January 2015 and March 2022 at Beijing Tongren Hospital. Inclusion criteria were: (1) pathologically confirmed OAL or IOI; (2) complete preoperative MRI data (T1WI, T2WI, and T1CE); and (3) lesions clearly visible on MRI. Exclusion criteria were: (1) severe image artifacts; and (2) lesions smaller than 1 cm. A total of 132 OAL and 106 IOI patients from Beijing Tongren Hospital were enrolled and randomly divided into training and internal test sets at a 7:3 ratio. Additionally, 31 OAL and 14 IOI patients from the Second Xiangya Hospital of Central South University during the same period were included in the external test set. The details of MRI acquisition are shown in Supplementary Table 1, https://links.lww.com/CM9/C203.
The regions of interest (ROIs) for OAL and IOI were manually delineated on T1WI, T2WI, and T1CE images using ITK-SNAP (version 4.0.0, developed by Penn Image Computing and Science Laboratory at the University of Pennsylvania, Philadelphia, USA, http://www.itksnap.org/) by a radiologist with 3 years of experience (Radiologist 1). These segmentations were then reviewed and adjusted by a senior radiologist with 10 years of experience (Radiologist 2). To assess intra-observer consistency, Radiologist 1 re-segmented images from 30 randomly selected patients. Visual assessments were conducted independently by two radiologists who were blinded to the pathological findings.
All MRI images underwent gray-level normalization (ranging from 0 to 1024) before feature extraction. Radiomics features were extracted using the FeAture Explorer software (FAE; version 0.5.8, developed by East China Normal University and Siemens Healthineers Ltd., Shanghai, China; https://github.com/salan668/FAE) configured with Pyradiomics. In total, 1688 features were extracted from each original MRI sequence image [Supplementary Table 2, https://links.lww.com/CM9/C203].
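The 7:3 random division described above can be illustrated with a short sketch. It assumes the split was stratified by diagnosis (the letter does not say so), and the patient IDs, the function name `split_7_3`, and the seed are hypothetical; only the group sizes come from the text.

```python
import random

def split_7_3(patient_ids, train_fraction=0.7, seed=0):
    """Shuffle one diagnosis group and split it into (train, test) lists."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # fixed seed so the split is reproducible
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

# Hypothetical IDs; only the group sizes (132 OAL, 106 IOI) are from the text.
oal_ids = [f"OAL-{i:03d}" for i in range(132)]
ioi_ids = [f"IOI-{i:03d}" for i in range(106)]

# Assumption: each diagnosis is split separately so both sets keep
# roughly the same OAL:IOI ratio.
oal_train, oal_test = split_7_3(oal_ids)
ioi_train, ioi_test = split_7_3(ioi_ids)

train_set = oal_train + ioi_train
test_set = oal_test + ioi_test
print(len(train_set), len(test_set))
```

Splitting at the patient level, rather than at the image or sequence level, keeps all three sequences of one patient in the same set.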
To balance the OAL and IOI sample numbers, we used the Synthetic Minority Oversampling Technique (SMOTE) to preprocess features from each MRI sequence. We searched for the best ML models for classifying OAL and IOI across multiple normalization methods, feature dimension reduction and selection approaches, and classification methods. Features were normalized using the Z-score, Min-Max, or Mean method. Feature dimensions were reduced with either the Pearson correlation coefficient (PCC), removing features with PCC >0.99, or principal component analysis (PCA). Feature selection used analysis of variance (ANOVA), Relief, recursive feature elimination (RFE), or Kruskal–Wallis (KW), with each technique selecting between 1 and 10 features. Ten ML algorithms were used for classification: logistic regression (LR), support vector machine (SVM), random forests (RF), logistic regression via Lasso (LRLasso), linear discriminant analysis (LDA), AdaBoost (AB), autoencoder (AE), naive Bayes (NB), Gaussian process (GP), and decision tree (DT). This resulted in 2400 pipelines, calculated as follows: 3 (normalization methods) × 2 (dimension reduction methods) × 4 (feature selection methods) × 10 (feature numbers) × 10 (classification methods) = 2400. Radiomics features from each MRI sequence were used to build models to distinguish OAL from IOI. We then combined T1WI, T2WI, and T1CE images to train another model and determined the optimal one. The workflow is illustrated in Figure 1.
Figure 1: Schematic diagram of the multiparametric MRI-based machine learning model construction for the differential diagnosis between OAL and IOI. AB: AdaBoost; AE: Autoencoder; ANOVA: Analysis of variance; AUC: Area under the receiver operating characteristic curve; DT: Decision tree; GLCM: Gray level co-occurrence matrix; GLDM: Gray level dependence matrix; GLRLM: Gray level run length matrix; GLSZM: Gray level size zone matrix; GP: Gaussian process; ICC: Intraclass correlation coefficient; IOI: Idiopathic orbital inflammation; KW: Kruskal–Wallis; LBP: Local binary pattern; LDA: Linear discriminant analysis; LR: Logistic regression; LRLasso: Logistic regression via Lasso; MRI: Magnetic resonance imaging; NB: Naive Bayes; NGTDM: Neighboring gray tone difference matrix; OAL: Ocular adnexal lymphoma; PCA: Principal component analysis; PCC: Pearson correlation coefficient; RF: Random forests; RFE: Recursive feature elimination; ROIs: Regions of interest; SVM: Support vector machine; T1CE: T1-weighted contrast-enhanced; T1WI: T1-weighted imaging; T2WI: T2-weighted imaging.
The t-test and chi-squared test were used to compare continuous and categorical variables, respectively. Intra-observer consistency was evaluated using the intraclass correlation coefficient (ICC). The chi-squared test was also used to compare diagnostic performance between visual assessment and ML models. Five-fold cross-validation was applied to the training set. Model performance was assessed using receiver operating characteristic (ROC) curve analysis, quantified by the area under the ROC curve (AUC). The DeLong test was used to compare ROC curves across models. Accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated at the Youden index cutoff. The 95% confidence interval (CI) was estimated via bootstrapping with 1000 replicates. Calibration was measured by the Brier score, which ranges from 0 to 1, with lower values indicating better calibration.
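The pipeline count given in the Methods can be checked by enumerating the grid directly. The sketch below uses only the option labels listed in the text; the variable names are illustrative, and it builds the combinations without running any model, so it says nothing about how FAE schedules or evaluates them.

```python
from itertools import product

# Option labels taken from the Methods text; variable names are illustrative.
NORMALIZERS = ["Z-score", "Min-Max", "Mean"]
REDUCERS = ["PCC", "PCA"]
SELECTORS = ["ANOVA", "Relief", "RFE", "KW"]
FEATURE_COUNTS = list(range(1, 11))  # 1 to 10 selected features
CLASSIFIERS = ["LR", "SVM", "RF", "LRLasso", "LDA", "AB", "AE", "NB", "GP", "DT"]

# Each pipeline is one choice from every stage.
pipelines = list(
    product(NORMALIZERS, REDUCERS, SELECTORS, FEATURE_COUNTS, CLASSIFIERS)
)
print(len(pipelines))  # 3 * 2 * 4 * 10 * 10 = 2400
```

The configuration reported as optimal for the combined sequences (Mean, PCA, ANOVA, ten features, LR) is one tuple in this list.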
Analyses were conducted using FAE in Python (version 3.7.6, Python Software Foundation, 9450 SW Gemini Dr., ECM# 90772, Beaverton, OR 97008, USA) and Statistical Product and Service Solutions (SPSS, version 20.0, SPSS Inc., Chicago, USA). A P-value less than 0.05 was considered statistically significant.
OAL was more common in older male patients in both the training and internal test sets (all P <0.05). In the external test set, OAL was also more common in older patients (P = 0.020), but gender distribution was not significantly different (P = 0.072). There were no significant differences in lesion side distribution across all sets (all P >0.05) [Supplementary Table 3, https://links.lww.com/CM9/C203]. The ICC values ranged from 0.815 to 0.915 (P <0.001), indicating satisfactory repeatability of feature extraction.
When T1WI, T2WI, and T1CE were combined to develop the differential diagnosis model, the pipeline with Mean normalization, PCA, ANOVA, and LR achieved the highest AUC. Ten features were retained in this model according to the "one-standard error" rule. The AUCs were 0.921 (95% CI: 0.876–0.966), 0.900 (95% CI: 0.851–0.948), 0.849 (95% CI: 0.759–0.940), and 0.786 (95% CI: 0.653–0.918) in the training, validation, internal test, and external test sets, respectively [Supplementary Table 4, Supplementary Figure 1, https://links.lww.com/CM9/C203]. These AUCs were higher than those of the separate MRI sequences, although the differences in AUC among the four models in the internal and external test sets were not statistically significant (DeLong test, all P >0.05). However, the model combining T1WI, T2WI, and T1CE had the lowest Brier scores of 0.155 (internal test set) and 0.190 (external test set), indicating good calibration. The ML model based on multi-sequence MRI outperformed a junior radiologist and matched the performance of a senior radiologist [Supplementary Table 5, https://links.lww.com/CM9/C203].
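For readers less familiar with the metrics reported above, the following sketch shows how an AUC, a Youden index cutoff, and a Brier score can be computed from predicted probabilities. It is a plain-Python illustration under stated assumptions, not the FAE implementation: the function names are hypothetical and the eight toy cases are invented.

```python
def auc(labels, scores):
    """AUC as the chance that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def youden_cutoff(labels, scores):
    """Threshold maximizing J = sensitivity + specificity - 1 (first maximum wins)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for t in sorted(set(scores)):
        # A case is called positive when its score is >= t.
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < t)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best["J"]:
            best = {"cutoff": t, "sensitivity": sens, "specificity": spec, "J": j}
    return best

def brier_score(labels, probs):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)

# Toy data invented for illustration (1 = OAL, 0 = IOI); not study data.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

print(auc(y_true, y_prob))
print(youden_cutoff(y_true, y_prob))
print(brier_score(y_true, y_prob))
```

A lower Brier score means the predicted probabilities sit closer to the observed outcomes, which is why the combined model's lower scores are read as better calibration.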
Several prior studies have suggested that MRI radiomics may be able to differentiate OAL from IOI [Supplementary Table 6, https://links.lww.com/CM9/C203]. However, each was carried out at a single institution, with a relatively small sample size and a single algorithm. Therefore, the diagnostic performance needs to be further improved. The present study differed from previous studies by employing a range of methods and algorithms to create 2400 processing pipelines for multiparametric MRI data. The optimal pipeline, configured with Mean normalization, PCA, ANOVA, and LR and based on the combination of T1WI, T2WI, and T1CE images, achieved the highest AUCs of 0.849 and 0.786 in the internal and external test cohorts, respectively, surpassing the previous findings.
We assessed our study using the Radiomics Quality Score (RQS),[4] achieving a score of 15. This is higher than the average RQS of 11.17 reported in a recent systematic review of ophthalmic radiomics studies.[5] The review highlighted limitations such as small sample sizes (median of 110 participants) and few studies with prospective designs or multicenter validation. Our study addresses these by including a relatively larger cohort (133 OALs and 106 IOIs) and an external validation set (31 OALs and 14 IOIs). The review also noted a lack of open data or code in many studies. In contrast, our study used the open-source tool FAE for radiomics analysis, making it more accessible to researchers pursuing similar work.
We acknowledge that there is a risk of overfitting due to the large number of features and relatively small sample sizes. To address this issue, we used feature selection methods such as ANOVA and feature dimension reduction methods such as PCC and PCA. In addition, we applied five-fold cross-validation to ensure the robustness of model evaluation.
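The PCC-based redundancy filter mentioned above (features with PCC >0.99 removed) might look like the following sketch. It assumes a greedy rule that keeps the first feature of each highly correlated pair and compares the absolute value of the coefficient; FAE's own rule for choosing which feature to drop may differ. The function names, feature names, and values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant feature: correlation undefined, treated as 0 here
    return cov / sqrt(vx * vy)

def drop_redundant(features, threshold=0.99):
    """Keep a feature only if |PCC| with every already-kept feature is <= threshold."""
    kept = {}
    for name, values in features.items():
        if all(abs(pearson(values, v)) <= threshold for v in kept.values()):
            kept[name] = values
    return kept

# Toy feature table (hypothetical names and values), one list per feature.
features = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.0, 4.0, 6.0, 8.0, 10.0],  # exact multiple of f1, so PCC is 1
    "f3": [5.0, 3.0, 4.0, 1.0, 2.0],   # PCC with f1 is -0.8
}
kept = drop_redundant(features)
print(list(kept))
```

Removing near-duplicate features before selection reduces the number of candidate inputs, which is one of the ways the letter describes limiting overfitting.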
Finally, the optimal model achieved satisfactory AUC values of 0.849 and 0.786 in the internal and external test sets, respectively, indicating promising diagnostic performance. The superior performance of the radiomics model combining T1WI, T2WI, and T1CE can be attributed to several factors: (1) The integration of multiparametric MRI sequences captures a comprehensive set of imaging features, enhancing the model's ability to distinguish between OAL and IOI. (2) Techniques such as PCA for dimension reduction, ANOVA for feature selection, and LR for classification help ensure that the most relevant and discriminative features are used. (3) High AUCs of 0.921, 0.900, 0.849, and 0.786 in the training, validation, internal test, and external test sets indicate the model's robustness and generalizability across different datasets. Clinically, this model advances non-invasive differentiation of OAL from IOI, potentially reducing the need for biopsies and improving treatment decisions.
This study has several limitations. First, because OAL comprises various histologic subtypes, future studies with larger sample sizes should conduct detailed subgroup evaluations. Second, the model relied on manual segmentation of orbital lesions, which is labor-intensive and time-consuming. Automated segmentation should be considered in future work. Third, although diffusion-weighted imaging (DWI), apparent diffusion coefficient (ADC), and dynamic contrast-enhanced (DCE) MRI can provide valuable insights for distinguishing between OAL and IOI, they were not included in this study. Finally, the retrospective design may introduce selection bias.
In conclusion, this study developed an ML model using radiomics from T1WI, T2WI, and T1CE MRI data to distinguish OAL from IOI. The optimal pipeline included Mean normalization, PCA for dimension reduction, ANOVA for feature selection, and LR for classification. This method shows promise as a tool for differential diagnosis between OAL and IOI, especially for radiology residents with limited head and neck imaging experience.
Funding
The study was supported by National Health Commission's Capacity Building and Continuing Education Center (No. YXFSC2022JJSJ009); Beijing Municipal Administration of Hospitals' Ascent Plan (No. DFL20190203); Beijing Postdoctoral Research Foundation (No. 2023-ZZ-027); National Key R&D Program of China (No. 2022YFC2404005).
Conflicts of interest
None.