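Random forests
Mathematics
AdaBoost
Tree (set theory)
Generalization
Statistics
Support vector machine
Generalization error
Metric (data warehouse)
Artificial intelligence
Pattern recognition (psychology)
Computer science
Artificial neural network
Data mining
Combinatorics
Mathematical analysis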
Source
Journal: Machine Learning
[Springer Nature]
Date: 2001-01-01
Volume (issue): 45 (1): 5-32
Citations: 95747
Identifier
DOI:10.1023/a:1010933404324
Abstract
Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International conference, ***, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.
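The ideas in the abstract can be sketched in a few dozen lines of plain Python. The block below is a minimal illustration, not the paper's exact procedure. Each tree is grown on a bootstrap sample, each node searches only a random subset of features for its split, and the forest predicts by majority vote. The "internal estimate" of error is taken here to be the out-of-bag error, computed for each point from the trees whose bootstrap sample did not contain it. All names (`make_blobs`, `grow`, `fit_forest`, `oob_error`, `n_try`) and the depth cap are assumptions made for this example. Class labels are assumed to be integers.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, rows, n_try, rng):
    """Search a random subset of n_try features for the lowest-impurity split."""
    best = None  # (weighted impurity, feature index, threshold)
    for f in rng.sample(range(len(X[0])), n_try):
        # Skip the largest value so both sides of the split are non-empty.
        for t in sorted({X[i][f] for i in rows})[:-1]:
            left = [y[i] for i in rows if X[i][f] <= t]
            right = [y[i] for i in rows if X[i][f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def grow(X, y, rows, n_try, rng, depth=0, max_depth=8):
    """Grow one tree; a leaf is a bare label, an internal node a 4-tuple."""
    labels = [y[i] for i in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or depth == max_depth:
        return majority
    split = best_split(X, y, rows, n_try, rng)
    if split is None:  # every candidate feature was constant on this node
        return majority
    _, f, t = split
    left = [i for i in rows if X[i][f] <= t]
    right = [i for i in rows if X[i][f] > t]
    return (f, t,
            grow(X, y, left, n_try, rng, depth + 1, max_depth),
            grow(X, y, right, n_try, rng, depth + 1, max_depth))

def predict_tree(node, x):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if x[f] <= t else right
    return node

def fit_forest(X, y, n_trees=25, n_try=2, seed=0):
    """Return a list of (tree, out-of-bag row indices) pairs."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        tree = grow(X, y, sample, n_try, rng)
        forest.append((tree, set(range(n)) - set(sample)))
    return forest

def predict(forest, x):
    """Majority vote over all trees."""
    votes = Counter(predict_tree(tree, x) for tree, _ in forest)
    return votes.most_common(1)[0][0]

def oob_error(forest, X, y):
    """Error rate where each point is voted on only by trees that never saw it."""
    wrong = counted = 0
    for i, x in enumerate(X):
        votes = Counter(predict_tree(tree, x) for tree, oob in forest if i in oob)
        if votes:
            counted += 1
            wrong += votes.most_common(1)[0][0] != y[i]
    return wrong / counted if counted else float("nan")

def make_blobs(n, seed=1):
    """Toy data: two informative features plus one pure-noise feature."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        centre = 3.0 * label
        X.append([rng.gauss(centre, 0.5), rng.gauss(centre, 0.5), rng.gauss(0.0, 1.0)])
        y.append(label)
    return X, y

X, y = make_blobs(120)
forest = fit_forest(X, y, n_trees=25, n_try=2)
err = oob_error(forest, X, y)
print("out-of-bag error estimate:", err)
```

Varying `n_try` in this sketch is one way to explore the trade-off the abstract describes. Fewer candidate features per node tend to lower the correlation between trees but also weaken each individual tree.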