Machine learning
Ensemble learning
Computer science
Bayesian optimization
Hyperparameter
Artificial intelligence
Generalization
Pairwise comparison
Bayesian probability
Mathematics
Mathematical analysis
Authors
Pranav Poduval, Sanjay Kumar Patnala, Gaurav Oberoi, Nitish Srivasatava, Siddhartha Asthana
Identifier
DOI:10.1145/3637528.3671894
Abstract
The Combined Algorithm Selection and Hyperparameter Optimization (CASH) problem is pivotal in Automated Machine Learning (AutoML). Most leading approaches combine Bayesian optimization with post-hoc ensemble building to create advanced AutoML systems. Bayesian optimization (BO) typically focuses on identifying a single algorithm and its hyperparameters that outperform all other configurations. Recent developments have highlighted an oversight in prior CASH methods: the lack of consideration for diversity among the base learners of the ensemble. This oversight was addressed by explicitly injecting the search for diversity into the traditional CASH problem. Even so, BO's limitation lies in its inability to directly optimize ensemble generalization error, offering no theoretical assurance that increased diversity correlates with enhanced ensemble performance. Our research addresses this gap by establishing a theoretical foundation that integrates diversity into the core of BO for direct ensemble learning. We explore a theoretically sound framework that describes the relationship between pairwise diversity and ensemble performance, which allows our Bayesian optimization framework, Optimal Diversity Bayesian Optimization (OptDivBO), to directly and efficiently minimize ensemble generalization error. OptDivBO guarantees an optimal balance between pairwise diversity and individual model performance, setting a new precedent in ensemble learning within CASH. Empirical results on 20 public datasets show that OptDivBO achieves the best average test ranks of 1.57 and 1.4 on classification and regression tasks, respectively.
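To make the abstract's central claim concrete: for a uniformly weighted regression ensemble, the classical ambiguity decomposition of Krogh and Vedelsby (1995) expresses the ensemble's generalization error as the average individual error minus a diversity term, and that diversity term can be rewritten exactly as a sum of pairwise prediction differences. The sketch below illustrates only this standard identity; it is not the OptDivBO objective or the paper's implementation, and the function names and toy data are hypothetical.

```python
import numpy as np


def ensemble_decomposition(preds, y):
    """Ambiguity decomposition for a uniformly weighted regression ensemble.

    preds : (M, N) array of predictions from M base models on N points
    y     : (N,) array of ground-truth targets

    Returns (ensemble_err, avg_individual_err, diversity), which satisfy
    ensemble_err = avg_individual_err - diversity exactly.
    """
    f_bar = preds.mean(axis=0)                       # ensemble prediction
    ensemble_err = np.mean((f_bar - y) ** 2)         # squared error of the ensemble
    avg_individual_err = np.mean((preds - y) ** 2)   # mean squared error of the members
    diversity = np.mean((preds - f_bar) ** 2)        # ambiguity (diversity) term
    return ensemble_err, avg_individual_err, diversity


def pairwise_diversity(preds):
    """Equivalent pairwise form of the diversity term:
    mean_i (f_i - f_bar)^2 == (1 / (2 M^2)) * sum_{i,j} (f_i - f_j)^2,
    averaged over data points. This is one standard sense in which
    pairwise prediction differences quantify ensemble diversity.
    """
    M = preds.shape[0]
    diffs = preds[:, None, :] - preds[None, :, :]    # (M, M, N) pairwise gaps
    return np.mean(np.sum(diffs ** 2, axis=(0, 1))) / (2 * M * M)


# Toy check: five noisy copies of the target act as base models.
rng = np.random.default_rng(0)
y = rng.normal(size=200)
preds = y + rng.normal(scale=0.5, size=(5, 200))

ens, ind, div = ensemble_decomposition(preds, y)
assert np.isclose(ens, ind - div)                    # decomposition holds exactly
assert np.isclose(div, pairwise_diversity(preds))    # pairwise form matches
```

A BO loop in this spirit would score each candidate (algorithm, hyperparameter) configuration by the change in ensemble error it induces when added to the current ensemble, trading average individual error against pairwise diversity rather than optimizing either in isolation; the paper's actual acquisition function and guarantees are developed in the full text.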