Authors
Teppei Konishi, Mateusz Grynkiewicz, Keita Saito, Takuma Kobayashi, Akiteru Goto, Michinobu Umakoshi, Takashi Iwata, Hiroshi Nishio, Yuki Katoh, Tomonobu Fujita, Tomoya Matsui, Masaki Sugawara, Hiroyuki Sano
Abstract
1549 Background: The presence of genetic mutations is an important prognostic factor in many types of cancer. However, genomic testing is expensive and challenging to perform. In contrast, hematoxylin and eosin (H&E) staining is relatively inexpensive and straightforward. Thus, in this study, we propose a method of predicting the presence of genetic mutations from H&E-stained whole-slide images (WSIs). Methods: We divided each H&E-stained WSI into small tiles, or “patches.” We used a deep learning model to classify each patch as tumor-containing or not. We then extracted image features from each tumor-containing patch using a deep learning-based feature extractor. We created a feature representation for the entire WSI by concatenating the features of these patches. We then trained genetic mutation classification models using the WSI features as the input and the presence or absence of genetic mutations as the output. Finally, we evaluated the performance of these models using the area under the receiver operating characteristic curve (AUC). Results: First, we evaluated our method on The Cancer Genome Atlas (TCGA) colorectal cancer dataset. We used H&E-stained WSIs and data on microsatellite instability (MSI) and BRAF gene mutations, which are directly relevant to therapeutic strategies, obtained from an independent clinical cohort of 566 TCGA patients with colon or rectum adenocarcinoma. We divided the data into training, validation, and test splits comprising 367, 90, and 109 patients, respectively. We used the training and validation splits for model training and selection, and the test split for model evaluation. The AUC values of the classification models, with associated 95% confidence intervals (CIs), were 0.721 (CI = 0.572–0.870) for MSI and 0.712 (CI = 0.547–0.877) for BRAF gene mutations. We also applied our approach to MUC16, KRAS, and ALK mutations using the TCGA lung cancer dataset.
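The Methods steps above (patching, tumor filtering, feature extraction, concatenation) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the tumor classifier, the feature extractor, or the patch size, so `is_tumor` and `extract_features` are hypothetical placeholders with trivial rules that only let the example run end to end. Note that plain concatenation yields a vector whose length depends on the number of retained patches; how this is handled is not stated in the abstract.

```python
import numpy as np


def tile_image(image: np.ndarray, patch_size: int) -> list[np.ndarray]:
    """Split an (H, W, 3) image into non-overlapping square patches.

    Edge regions that do not fill a whole patch are dropped.
    """
    height, width = image.shape[:2]
    patches = []
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            patches.append(image[top:top + patch_size, left:left + patch_size])
    return patches


def is_tumor(patch: np.ndarray) -> bool:
    """Hypothetical stand-in for the tumor patch classifier.

    A real pipeline would call a trained deep learning model here;
    this placeholder simply flags darker patches.
    """
    return float(patch.mean()) < 128.0


def extract_features(patch: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the deep feature extractor.

    Returns the per-channel mean and standard deviation (6 values).
    """
    pixels = patch.reshape(-1, patch.shape[-1]).astype(np.float64)
    return np.concatenate([pixels.mean(axis=0), pixels.std(axis=0)])


def build_wsi_features(image: np.ndarray, patch_size: int,
                       tumor_only: bool = True) -> np.ndarray:
    """Concatenate patch features into one slide-level vector.

    With tumor_only=True only patches flagged by is_tumor are kept;
    with tumor_only=False every patch contributes.
    """
    patches = tile_image(image, patch_size)
    if tumor_only:
        patches = [p for p in patches if is_tumor(p)]
    if not patches:
        return np.empty(0)
    return np.concatenate([extract_features(p) for p in patches])


# Toy 8x8 "slide": one dark quadrant stands in for tumor tissue.
slide = np.full((8, 8, 3), 200, dtype=np.uint8)
slide[:4, :4] = 50
tumor_features = build_wsi_features(slide, patch_size=4)
all_features = build_wsi_features(slide, patch_size=4, tumor_only=False)
```

In this toy case one of the four patches passes the placeholder filter, so `tumor_features` has 6 entries and `all_features` has 24.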
We divided 909 TCGA lung adenocarcinoma and lung squamous cell carcinoma patients into training, validation, and test splits comprising 582, 146, and 181 patients, respectively. Unlike for the colorectal dataset, WSI features were generated from all patches rather than only tumor-containing ones. The AUC values on the test split were 0.897 (CI = 0.85–0.95) for MUC16, 0.845 (CI = 0.75–0.94) for KRAS, and 0.756 (CI = 0.57–0.94) for ALK mutations. Conclusions: We proposed an approach to predict the presence of genetic mutations using only H&E-stained WSIs and evaluated its performance on colorectal and lung cancer datasets. Our model has the potential to predict the presence of certain genetic mutations with good discriminative performance, with test AUCs ranging from 0.712 to 0.897. These predictions could help improve the accuracy of prognostic prediction using WSIs alone.
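The evaluation metric reported above, an AUC with a 95% CI on a held-out test split, can be illustrated with the sketch below. The abstract does not state how the CIs were derived; a percentile bootstrap over test patients is assumed here as one common choice. The function names, the number of resamples, and the synthetic labels and scores are all illustrative assumptions, not data from the study.

```python
import numpy as np


def roc_auc(labels, scores) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUC needs at least one positive and one negative")
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (pos.size * neg.size))


def bootstrap_auc_ci(labels, scores, n_resamples=2000, alpha=0.05, seed=0):
    """Return (auc, lower, upper) using a percentile bootstrap (assumed method)."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    point = roc_auc(labels, scores)  # also validates that both classes exist
    rng = np.random.default_rng(seed)
    n = labels.size
    aucs = []
    while len(aucs) < n_resamples:
        idx = rng.integers(0, n, size=n)  # resample patients with replacement
        if labels[idx].all() or not labels[idx].any():
            continue  # AUC is undefined when a resample has a single class
        aucs.append(roc_auc(labels[idx], scores[idx]))
    lower, upper = np.quantile(aucs, [alpha / 2, 1 - alpha / 2])
    return point, float(lower), float(upper)


# Synthetic test split of 109 patients (illustrative only, not study data).
rng = np.random.default_rng(42)
y_true = rng.random(109) < 0.2                     # roughly 20% mutation-positive
y_score = rng.normal(loc=y_true.astype(float), scale=1.0)
auc, ci_low, ci_high = bootstrap_auc_ci(y_true, y_score, n_resamples=500)
print(f"AUC = {auc:.3f} (95% CI = {ci_low:.3f}-{ci_high:.3f})")
```

With few mutation-positive patients in a test split, bootstrap intervals of this kind tend to be wide, which is consistent with the broad CIs reported for several of the mutations.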