Overfitting
Early stopping
Computer science
Machine learning
Artificial intelligence
Set (abstract data type)
Training set
Feature selection
Regularization (linguistics)
Feature (linguistics)
Selection (genetic algorithm)
Data set
Data mining
Pattern recognition (psychology)
Artificial neural network
Linguistics
Philosophy
Programming language
Source
Journal: Journal of Physics: Conference Series
[IOP Publishing]
Date: 2019-02-01
Volume/Issue: 1168: 022022
Citations: 1601
Identifier
DOI:10.1088/1742-6596/1168/2/022022
Abstract
Overfitting is a fundamental issue in supervised machine learning that prevents models from generalizing well: a model fits the observed training data closely but fails on unseen data in the test set. Overfitting arises from the presence of noise, the limited size of the training set, and the complexity of classifiers. This paper discusses overfitting from the perspectives of its causes and solutions. To reduce its effects, various strategies are proposed to address these causes: 1) an "early-stopping" strategy prevents overfitting by stopping training before performance stops improving; 2) a "network-reduction" strategy excludes noise in the training set; 3) a "data-expansion" strategy fine-tunes the hyper-parameter sets of complicated models with a large amount of data; and 4) a "regularization" strategy guarantees model performance to a great extent on real-world problems through feature selection, distinguishing more useful features from less useful ones.
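The "early-stopping" strategy the abstract mentions can be sketched as a simple rule: halt training once the validation loss has not improved for a fixed number of epochs. The sketch below is illustrative only; the function and variable names (`early_stopping_epoch`, `patience`) are assumptions, not taken from the paper.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the epoch index of the best validation loss, given a
    per-epoch loss sequence, stopping once `patience` consecutive
    epochs pass without improvement (a minimal early-stopping sketch)."""
    best = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:  # no improvement for `patience` epochs
                break
    return best_epoch

# Example: loss improves, then rises as overfitting sets in
losses = [1.0, 0.8, 0.6, 0.55, 0.56, 0.58, 0.60, 0.62]
print(early_stopping_epoch(losses))  # prints 3 (loss 0.55)
```

In practice the same rule is applied to a held-out validation set during training, so the model weights from the best epoch are restored rather than those from the final epoch.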