Authors
Zhengsong Pan, Ge Hu, Zhenchen Zhu, Weixiong Tan, Wei Han, Z.-G. Zhou, Wei Song, Yizhou Yu, Lan Song, Zhengyu Jin
Abstract
Background Preoperative discrimination of preinvasive, minimally invasive, and invasive adenocarcinoma at CT informs clinical management decisions but can be challenging for pure ground-glass nodules (pGGNs). Deep learning (DL) may improve ternary classification.

Purpose To determine whether a strategy that includes an adjudication approach can enhance the performance of DL ternary classification models in predicting the invasiveness of adenocarcinoma at chest CT and maintain performance in classifying pGGNs.

Materials and Methods In this retrospective study, six ternary models for classifying preinvasive, minimally invasive, and invasive adenocarcinoma were developed using a multicenter data set of lung nodules. The DL-based models were progressively modified through framework optimization, joint learning, and an adjudication strategy that simulates a multireader approach to resolving discordant nodule classifications: two binary classification models are integrated with a ternary classification model to resolve discordant classifications sequentially. The six ternary models were then tested on an external data set of pGGNs imaged between December 2019 and January 2021. Diagnostic performance, including accuracy, specificity, and sensitivity, was assessed. The χ² test was used to compare model performance in different subgroups stratified by clinical confounders.

Results A total of 4929 nodules from 4483 patients (mean age, 50.1 years ± 9.5 [SD]; 2806 female) were divided into training (n = 3384), validation (n = 579), and internal test (n = 966) sets. A total of 361 pGGNs from 281 patients (mean age, 55.2 years ± 11.1 [SD]; 186 female) formed the external test set. The proposed strategy improved DL model performance in external testing (P < .001).
For classifying minimally invasive adenocarcinoma, the model with adjudication (model 6) outperformed the model without it (model 3): accuracy was 85% versus 79%, sensitivity was 75% versus 63%, and specificity was 89% versus 85%, respectively. Model 6 showed a relatively narrow range (maximum minus minimum) across diagnostic indexes (accuracy, 1.7%; sensitivity, 7.3%; specificity, 0.9%) compared with the other models (accuracy, 0.6%–10.8%; sensitivity, 14%–39.1%; specificity, 5.5%–17.9%).

Conclusion Combining framework optimization, joint learning, and an adjudication approach improved DL classification of adenocarcinoma invasiveness at chest CT.

Published under a CC BY 4.0 license. Supplemental material is available for this article. See also the editorial by Sohn and Fields in this issue.
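
The per-class accuracy, sensitivity, and specificity reported above are the kind of figures one typically obtains by treating one class of a ternary problem as "positive" and the other two as "negative" (one-vs-rest). The sketch below is only an illustration of that standard calculation and of the "maximum minus minimum" range; it is not the authors' code, and the class labels and the function names `one_vs_rest_metrics` and `metric_range` are assumptions introduced here.

```python
from math import nan
from typing import Dict, Sequence

# Hypothetical label strings for the three classes named in the abstract.
CLASSES = ("preinvasive", "minimally_invasive", "invasive")


def one_vs_rest_metrics(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str
) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity for one class against the rest."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1  # positive class correctly identified
        elif truth == positive:
            fn += 1  # positive class missed
        elif pred == positive:
            fp += 1  # another class mislabeled as positive
        else:
            tn += 1  # both truth and prediction fall in the "rest" group

    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else nan,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


def metric_range(values: Sequence[float]) -> float:
    """Range (maximum minus minimum) of one metric, e.g. across subgroups."""
    return max(values) - min(values)
```

Calling `one_vs_rest_metrics(y_true, y_pred, "minimally_invasive")` on a list of reference labels and model predictions returns the three indexes for that class; applying `metric_range` to the same index computed in several subgroups gives a spread of the kind quoted for model 6.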