Authors
Anette Paulina Vistoso Monreal, Nicolás Veas, Kyle Jones, Alessandro Villa
Abstract
Objective: Predictors of malignant transformation (MT) of oral leukoplakia (OL) are poorly defined. Machine learning (ML) has improved decision-making processes in both medicine and dentistry. The aim of this study was to model, implement, and evaluate a series of ML algorithms to predict the MT of OL and to select the best-performing model for future use in clinical settings.

Methods: A retrospective search using the Patient Explorer software was conducted to identify all patients with OL seen at the University of California San Francisco (June 2013 to November 2021). Patient demographics, smoking status, date of OL diagnosis, presence of dysplasia, and cancer were recorded. Exploratory data analysis and labeling were conducted with Python. ML modeling was performed with all variables transformed into categorical ones. Twenty ML algorithms were applied to the prepared dataset (70% for training and 30% for testing), and recall, F1 score, and importance score (IS) were calculated. The final model of the MT of OL was selected by comparing the performance of the algorithms.

Results: A total of 703 patients with OL were included (45% female; 84% White or Caucasian). The median age at OL diagnosis was 60 years. Fifty-seven percent were never smokers, and 64% were partnered. Fifty-eight patients developed oral cancer (OC) within the first year after the OL diagnosis. A histological diagnosis of OL was available for 15% of patients and showed dysplasia before an oral cancer diagnosis. The ML algorithms with the best performance based on specificity and accuracy were Tuned Random Forest and Tuned Decision Tree. Both models showed approximately 60% accuracy in predicting OC in patients with OL, with a recall (sensitivity) > 80%. The variables with the highest IS were "age < 40 years", "former smoker", "being White or Caucasian", "never smoker", "presence of leukoplakias at other sites", "marital status (partnered)", and "history of oral dysplasia".

Conclusions: Artificial intelligence has the potential to improve the prediction of MT of OL. Future studies should incorporate histopathology data and detailed descriptors of the affected oral site to improve the accuracy of the model.
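
The abstract reports a 70%/30% train/test split and evaluates models by recall, F1, and accuracy, but it does not show how these quantities relate to one another. The following is a minimal sketch in plain Python, assuming binary labels (1 = malignant transformation, 0 = no transformation). The helper names `split_70_30` and `binary_metrics` and the toy labels are hypothetical illustrations, not the study's code or data.

```python
import random
from typing import Dict, List, Sequence, Tuple


def split_70_30(rows: Sequence, seed: int = 0) -> Tuple[List, List]:
    """Shuffle rows and split them into 70% training and 30% testing.

    Hypothetical helper: the study does not state how its split was drawn.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute recall, specificity, accuracy, and F1 from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    recall = ratio(tp, tp + fn)        # sensitivity: share of true MT cases found
    specificity = ratio(tn, tn + fp)   # share of non-MT cases correctly ruled out
    precision = ratio(tp, tp + fp)
    return {
        "recall": recall,
        "specificity": specificity,
        "accuracy": ratio(tp + tn, len(pairs)),
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Toy labels for illustration only (NOT study data).
toy_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 1, 0, 1, 1, 0, 0, 0]
toy_metrics = binary_metrics(toy_true, toy_pred)

# A cohort the size of the one reported (703 records) split 70/30.
train_rows, test_rows = split_70_30(range(703))
```

On the toy labels, recall is high while accuracy is lower. This shows how a model can detect most positive cases while still producing enough false positives to pull overall accuracy down, which is the kind of trade-off the reported figures (recall > 80%, accuracy about 60%) suggest.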