Random search for hyper-parameter optimization

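Keywords: hyper-parameter optimization; random search; computer science; grid; set (abstract data type); artificial neural network; fraction (chemistry); search algorithm; data mining; artificial intelligence; machine learning; algorithm; mathematics; geometry; support vector machine; organic chemistry; chemistry; programming language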

Authors

James Bergstra, Yoshua Bengio

Source

Journal: Journal of Machine Learning Research [The MIT Press]
Date: 2012-03-01 Volume (issue): 13 (1): 281-305 Citations: 1268

Abstract

Grid search and manual search are the most widely used strategies for hyper-parameter optimization. This paper shows empirically and theoretically that randomly chosen trials are more efficient for hyper-parameter optimization than trials on a grid. Empirical evidence comes from a comparison with a large previous study that used grid search and manual search to configure neural networks and deep belief networks. Compared with neural networks configured by a pure grid search, we find that random search over the same domain is able to find models that are as good or better within a small fraction of the computation time. Granting random search the same computational budget, random search finds better models by effectively searching a larger, less promising configuration space. Compared with deep belief networks configured by a thoughtful combination of manual search and grid search, purely random search over the same 32-dimensional configuration space found statistically equal performance on four of seven data sets, and superior performance on one of seven. A Gaussian process analysis of the function from hyper-parameters to validation set performance reveals that for most data sets only a few of the hyper-parameters really matter, but that different hyper-parameters are important on different data sets. This phenomenon makes grid search a poor choice for configuring algorithms for new data sets. Our analysis casts some light on why recent High Throughput methods achieve surprising success--they appear to search through a large number of hyper-parameters because most hyper-parameters do not matter much. We anticipate that growing interest in large hierarchical models will place an increasing burden on techniques for hyper-parameter optimization; this work shows that random search is a natural baseline against which to judge progress in the development of adaptive (sequential) hyper-parameter optimization algorithms.
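The abstract's central argument, that only a few hyper-parameters really matter, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy `objective` (which depends on `x` only), the unit-square search domain, the 9-trial budget, and the helper names `grid_search` and `random_search`. With the same budget, a 3 x 3 grid tries only 3 distinct values of the one parameter that matters, whereas 9 random trials try 9.

```python
import itertools
import random


def objective(x, y):
    """Hypothetical validation score: only x matters, y is irrelevant."""
    return 1.0 - (x - 0.37) ** 2 + 0.0 * y


def grid_search(points_per_axis):
    """Evaluate every point of a regular grid on the unit square."""
    axis = [(i + 0.5) / points_per_axis for i in range(points_per_axis)]
    trials = list(itertools.product(axis, axis))
    best = max(objective(x, y) for x, y in trials)
    return trials, best


def random_search(n_trials, rng):
    """Evaluate n_trials points drawn uniformly from the unit square."""
    trials = [(rng.random(), rng.random()) for _ in range(n_trials)]
    best = max(objective(x, y) for x, y in trials)
    return trials, best


rng = random.Random(0)  # fixed seed so the run is repeatable
grid_trials, grid_best = grid_search(3)        # 3 x 3 = 9 trials
rand_trials, rand_best = random_search(9, rng)  # same budget of 9 trials

# Count how many distinct values of the important parameter each method tried.
grid_distinct_x = len({x for x, _ in grid_trials})
rand_distinct_x = len({x for x, _ in rand_trials})

print("grid:   distinct x =", grid_distinct_x, " best =", grid_best)
print("random: distinct x =", rand_distinct_x, " best =", rand_best)
```

The gap in coverage of the important axis widens as more irrelevant dimensions are added, since a grid spends its budget on combinations that differ only in parameters with no effect.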
