过度拟合
计算机科学
超参数
机器学习
阿卡克信息准则
选型
人工智能
最大熵原理
一套
特征选择
数据挖掘
人工神经网络
历史
考古
作者
Sergio Vignali,Arnaud Barras,Raphaël Arlettaz,Veronika Braunisch
摘要
Abstract Balancing model complexity is a key challenge of modern computational ecology, particularly so since the spread of machine learning algorithms. Species distribution models are often implemented using a wide variety of machine learning algorithms that can be fine‐tuned to achieve the best model prediction while avoiding overfitting. We have released SDMtune , a new R package that aims to facilitate training, tuning, and evaluation of species distribution models in a unified framework. The main innovations of this package are its functions to perform data‐driven variable selection, and a novel genetic algorithm to tune model hyperparameters. Real‐time and interactive charts are displayed during the execution of several functions to help users understand the effect of removing a variable or varying model hyperparameters on model performance. SDMtune supports three different metrics to evaluate model performance: the area under the receiver operating characteristic curve, the true skill statistic, and Akaike's information criterion corrected for small sample sizes. It implements four statistical methods: artificial neural networks, boosted regression trees, maximum entropy modeling, and random forest. Moreover, it includes functions to display the outputs and create a final report. SDMtune therefore represents a new, unified and user‐friendly framework for the still‐growing field of species distribution modeling.
科研通智能强力驱动
Strongly Powered by AbleSci AI