Authors
Lars Johannes Isaksson, Marco Repetto, Paul Summers, Matteo Pepa, Mattia Zaffaroni, Maria Giulia Vincini, Giulia Corrao, Giovanni Mazzola, Marco Rotondi, Federica Bellerba, Sara Raimondi, Zaharudin Haron, Sarah Alessi, Paula Pricolo, Francesco A. Mistretta, Stefano Luzzago, Federica Cattani, Gennaro Musi, Ottavio De Cobelli, Marta Cremonesi, Roberto Orecchia, Davide La Torre, Giulia Marvaso, Giuseppe Petralia, Barbara Alicja Jereczek‐Fossa
Abstract
When researchers build machine learning (ML) radiomic models, the first choice they have to make is which model to use. Naturally, the goal is to use the model with the best performance. But what is the best model? It is well known in ML that modern techniques such as gradient boosting and deep learning have greater capacity than traditional models to solve complex problems in high dimensions. Despite this, most radiomics researchers still do not focus on these models in their research. As access to large, high-quality data sets increases, these high-capacity ML models may become even more relevant. In this article, we use a large data set of 949 prostate cancer patients to compare the performance of a few of the most promising ML models for tabular data: gradient-boosted decision trees (GBDTs), multilayer perceptrons, convolutional neural networks, and transformers. To this end, we predict nine different prostate cancer pathology outcomes of clinical interest. Our goal is to give a rough overview of how these models compare against one another in a typical radiomics setting. We also investigate whether multitask learning improves the performance of these models when multiple targets are available. Our results suggest that GBDTs perform well across all targets, and that multitask learning does not provide a consistent improvement.