Authors
Julien Hédou, Ivana Marić, Grégoire Bellan, Jakob Einhaus, Dyani Gaudillière, Francois-Xavier Ladant, Franck Verdonk, Ina A. Stelzer, Dorien Feyaerts, Amy S. Tsai, Edward A. Ganio, Maximilian Sabayev, Joshua Gillard, Jonas N. Amar, Amélie Cambriel, Tomiko Oskotsky, Alennie Roldan, Jonathan L. Golob, Marina Sirota, Thomas A. Bonham, M Sato, Maïgane Diop, Xavier Durand, Martin S. Angst, David K. Stevenson, Nima Aghaeepour, Andrea Montanari, Brice Gaudillière
Abstract
Adoption of high-content omic technologies in clinical studies, coupled with computational methods, has yielded an abundance of candidate biomarkers. However, translating such findings into bona fide clinical biomarkers remains challenging. To facilitate this process, we introduce Stabl, a general machine learning method that identifies a sparse, reliable set of biomarkers by integrating noise injection and a data-driven signal-to-noise threshold into multivariable predictive modeling. Evaluation of Stabl on synthetic datasets and five independent clinical studies demonstrates improved biomarker sparsity and reliability compared with commonly used sparsity-promoting regularization methods, while maintaining predictive performance; it distills datasets containing 1,400–35,000 features down to 4–34 candidate biomarkers. Stabl extends to multi-omic integration tasks, enabling biological interpretation of complex predictive models, as it homes in on a shortlist of proteomic, metabolomic and cytometric events predicting labor onset, microbial biomarkers of pre-term birth and a pre-operative immune signature of post-surgical infections. Stabl is available at https://github.com/gregbellan/Stabl.
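To make the idea of a data-driven signal-to-noise threshold concrete, the following is a minimal sketch and not the Stabl API; the repository linked above holds the actual implementation. It assumes that each feature already has a selection frequency (how often a sparse model picked it across resampled fits) and that artificial noise features were injected and scored the same way. The function name `choose_threshold`, the false-discovery proxy it minimizes, and all frequencies below are hypothetical and chosen for illustration only.

```python
def choose_threshold(real_freqs, noise_freqs, grid=None):
    """Pick the frequency cutoff that minimizes a false-discovery proxy.

    Assumed proxy (illustrative, not taken from the abstract):
        proxy(t) = (1 + #noise features with freq >= t)
                   / max(1, #real features with freq >= t)
    """
    if grid is None:
        # Candidate cutoffs 0.01, 0.02, ..., 1.00
        grid = [i / 100 for i in range(1, 101)]

    best_t, best_score = None, float("inf")
    for t in grid:
        n_noise = sum(f >= t for f in noise_freqs)
        n_real = sum(f >= t for f in real_freqs)
        score = (1 + n_noise) / max(1, n_real)
        # Strict comparison keeps the lowest cutoff among ties
        if score < best_score:
            best_t, best_score = t, score

    # Indices of real features that pass the chosen cutoff
    selected = [i for i, f in enumerate(real_freqs) if f >= best_t]
    return best_t, best_score, selected


# Hypothetical selection frequencies for 7 real and 7 injected noise features
real_freqs = [0.95, 0.90, 0.85, 0.30, 0.20, 0.10, 0.05]
noise_freqs = [0.35, 0.25, 0.15, 0.10, 0.05, 0.05, 0.02]

threshold, score, selected = choose_threshold(real_freqs, noise_freqs)
print(threshold, selected)  # 0.36 [0, 1, 2]
```

In this toy example the cutoff settles just above the highest-scoring noise feature, so only the three real features that clearly outscore the injected noise are kept. The point of the sketch is that the cutoff comes from the data rather than being fixed in advance.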