A working guide to boosted regression trees

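Keywords: Boosting (machine learning); Regression; Computer science; Decision tree; Outlier; Machine learning; Regression analysis; Tree (set theory); Statistical model; Artificial intelligence; Multivariate adaptive regression splines; Linear regression; Simple linear regression; Predictive modelling; Data mining; Statistics; Mathematics; Polynomial regression; Mathematical analysis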
Authors
Jane Elith, John R. Leathwick, Trevor Hastie
Source
Journal: Journal of Animal Ecology [Wiley]
Volume (issue): 77 (4): 802-813 Cited by: 6032
Identifier
DOI:10.1111/j.1365-2656.2008.01390.x
Abstract

1. Ecologists use statistical models for both explanation and prediction, and need techniques that are flexible enough to express typical features of their data, such as nonlinearities and interactions.

2. This study provides a working guide to boosted regression trees (BRT), an ensemble method for fitting statistical models that differs fundamentally from conventional techniques that aim to fit a single parsimonious model. Boosted regression trees combine the strengths of two algorithms: regression trees (models that relate a response to their predictors by recursive binary splits) and boosting (an adaptive method for combining many simple models to give improved predictive performance). The final BRT model can be understood as an additive regression model in which individual terms are simple trees, fitted in a forward, stagewise fashion.

3. Boosted regression trees incorporate important advantages of tree-based methods, handling different types of predictor variables and accommodating missing data. They have no need for prior data transformation or elimination of outliers, can fit complex nonlinear relationships, and automatically handle interaction effects between predictors. Fitting multiple trees in BRT overcomes the biggest drawback of single tree models: their relatively poor predictive performance. Although BRT models are complex, they can be summarized in ways that give powerful ecological insight, and their predictive performance is superior to most traditional modelling methods.

4. The unique features of BRT raise a number of practical issues in model fitting. We demonstrate the practicalities and advantages of using BRT through a distributional analysis of the short-finned eel (Anguilla australis Richardson), a native freshwater fish of New Zealand. We use a data set of over 13 000 sites to illustrate effects of several settings, and then fit and interpret a model using a subset of the data. We provide code and a tutorial to enable the wider use of BRT by ecologists.
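
The description of BRT as an additive model of simple trees fitted in a forward, stagewise fashion can be illustrated with a small sketch. The Python below is not the authors' code or tutorial. It is a minimal illustration under several assumptions that do not come from the abstract: a single numeric predictor, squared-error loss, trees restricted to one binary split (stumps), and arbitrary values for the number of trees and the learning rate. All function names (`fit_stump`, `fit_brt`, `predict_brt`, `mse`) are hypothetical.

```python
# Minimal sketch of boosting with regression stumps (one predictor,
# squared-error loss). Names and settings are illustrative assumptions,
# not taken from the paper.

def fit_stump(x, residuals):
    """Return (threshold, left_mean, right_mean) for the single binary
    split of x that minimises the residual sum of squares."""
    values = sorted(set(x))
    overall = sum(residuals) / len(residuals)
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2.0
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # Only one distinct x value: no split is possible, predict the mean.
        return values[0], overall, overall
    return best[1], best[2], best[3]


def predict_stump(stump, xi):
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean


def fit_brt(x, y, n_trees=200, learning_rate=0.1):
    """Forward stagewise fit: each new stump is fitted to the residuals of
    the current model and added with a shrunken weight. Earlier stumps are
    never refitted."""
    intercept = sum(y) / len(y)
    fitted = [intercept] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - fi for yi, fi in zip(y, fitted)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        fitted = [fi + learning_rate * predict_stump(stump, xi)
                  for fi, xi in zip(fitted, x)]
    return intercept, stumps, learning_rate


def predict_brt(model, xi):
    """Additive prediction: intercept plus the shrunken sum of all stumps."""
    intercept, stumps, learning_rate = model
    return intercept + learning_rate * sum(predict_stump(s, xi) for s in stumps)


def mse(y, predictions):
    return sum((yi - pi) ** 2 for yi, pi in zip(y, predictions)) / len(y)


# Toy nonlinear response (made-up data, for illustration only).
x_demo = [i / 10 for i in range(50)]
y_demo = [(xi - 2.5) ** 2 for xi in x_demo]

model = fit_brt(x_demo, y_demo)
single_stump = fit_stump(x_demo, y_demo)

print("single stump MSE:", mse(y_demo, [predict_stump(single_stump, xi) for xi in x_demo]))
print("boosted MSE:     ", mse(y_demo, [predict_brt(model, xi) for xi in x_demo]))
```

On this toy curve the boosted sum of stumps should track the training data more closely than any single stump can, which is the sense in which fitting many simple trees addresses the weak predictive performance of one tree. The sketch reports training error only; choosing the number of trees and the learning rate in practice requires held-out or cross-validated data.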