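Random forest
Computer science
Proportional hazards model
Random effects model
Regression
Construct (Python library)
Big data
Statistics
Tree (set theory)
Woodland
Predictive power
Data mining
Machine learning
Artificial intelligence
Econometrics
Mathematics
Medicine
Philosophy
Mathematical analysis
Internal medicine
Programming language
Epistemology
Meta-analysis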
Source
Journal: Journal of insurance medicine
[American Academy of Insurance Medicine]
Date: 2017-01-01
Volume (issue): 47 (1): 31-39
Citations: 518
Identifier
DOI:10.17849/insm-47-01-31-39.1
Abstract
For the task of analyzing survival data to derive risk factors associated with mortality, physicians, researchers, and biostatisticians have typically relied on certain types of regression techniques, most notably the Cox model. With the advent of more widely distributed computing power, methods which require more complex mathematics have become increasingly common. Particularly in this era of "big data" and machine learning, survival analysis has become methodologically broader. This paper aims to explore one technique known as Random Forest. The Random Forest technique is a regression tree technique which uses bootstrap aggregation and randomization of predictors to achieve a high degree of predictive accuracy. The various input parameters of the random forest are explored. Colon cancer data (n = 66,807) from the SEER database is then used to construct both a Cox model and a random forest model to determine how well the models perform on the same data. Both models perform well, achieving a concordance error rate of approximately 18%.
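The "concordance error rate" quoted in the abstract can be read as one minus a concordance index. The block below is a minimal sketch of that quantity, assuming Harrell's definition of the index and a simplified treatment of ties (pairs with equal follow-up times are skipped). It is not taken from the paper: the function name `concordance_error` and the toy data are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def concordance_error(times, events, risk_scores):
    """Return 1 - C, where C is a Harrell-style concordance index.

    times       -- observed follow-up time for each subject
    events      -- 1 if death was observed, 0 if the subject was censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    usable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is usable only if the shorter time is an observed death.
        # Simplifying assumption: pairs with tied times are skipped.
        if times[a] == times[b] or not events[a]:
            continue
        usable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier death had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable pairs: cannot compute concordance")
    return 1.0 - concordant / usable


# Hypothetical toy data: five subjects, two of them censored.
toy_times = [5, 8, 12, 3, 9]
toy_events = [1, 0, 1, 1, 0]
toy_risk = [0.9, 0.4, 0.2, 0.7, 0.5]
toy_error = concordance_error(toy_times, toy_events, toy_risk)
```

Under this reading, an error rate of about 18% would mean that the model ranks roughly 82% of usable subject pairs in the correct order. The same function can score any model that outputs a risk ranking, which is what makes it a reasonable common yardstick for comparing a Cox model with a random forest.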