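Computer science
Leakage (economics)
Machine learning
Artificial intelligence
Risk analysis (engineering)
Perspective (graphical)
Medicine
Economy
Macroeconomics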
Authors
Sharon E. Davis, Michael E. Matheny, Suresh Balu, Mark Sendak
Identifier
DOI:10.1093/jamia/ocad178
Abstract
Introduction: The pitfalls of label leakage, the contamination of model input features with outcome information, are well established. Unfortunately, avoiding label leakage in clinical prediction models requires more nuance than the common advice of applying the “no time machine rule.”
Framework: We provide a framework for contemplating whether and when model features pose leakage concerns by considering the cadence, perspective, and applicability of predictions. To ground these concepts, we use real-world clinical models to highlight examples of appropriate and inappropriate label leakage in practice.
Recommendations: Finally, we provide recommendations to support clinical and technical stakeholders as they evaluate the leakage tradeoffs associated with model design, development, and implementation decisions. By providing common language and dimensions to consider when designing models, we hope the clinical prediction community will be better prepared to develop statistically valid and clinically useful machine learning models.