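Mathematical optimization
Optimization problem
Leverage (statistics)
Integer (computer science)
Computer science
Selection (genetic algorithm)
Integer programming
Mathematics
Artificial intelligence
Programming language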
Authors
Andrés Gómez, Oleg A. Prokopyev
Source
Journal: INFORMS Journal on Computing
Date: 2021-03-12
Cited by: 6
Identifier
DOI: 10.1287/ijoc.2020.1031
Abstract
We consider the best subset selection problem in linear regression, that is, finding a parsimonious subset of the regression variables that provides the best fit to the data according to some predefined criterion. We are primarily concerned with alternatives to cross-validation methods that do not require data partitioning and involve a range of information criteria extensively studied in the statistical literature. We show that the problem of interest can be modeled using fractional mixed-integer optimization, which can be tackled by leveraging recent advances in modern optimization solvers. The proposed algorithms involve solving a sequence of mixed-integer quadratic optimization problems (or their convexifications) and can be implemented with off-the-shelf solvers. We report encouraging results in our computational experiments, with respect to both the optimization and statistical performance.
Summary of Contribution: This paper considers feature selection problems with information criteria. We show that by adopting a fractional optimization perspective (a well-known field in nonlinear optimization and operations research), it is possible to leverage recent advances in mixed-integer quadratic optimization technology to tackle traditional statistical problems long considered intractable. We present extensive computational experiments, with both synthetic and real data, illustrating that the new fractional optimization approach is orders of magnitude faster than existing approaches in the literature.
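The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a fractional criterion can be reduced to a sequence of penalized subset problems. It uses Dinkelbach's classic method for fractional programs, which is not necessarily the authors' exact scheme. The criterion assumed here is the residual mean square RSS(S) / (n - |S| - 1), whose minimizer also maximizes adjusted R². For a fixed multiplier lam, the subproblem min RSS(S) - lam (n - |S| - 1) is an L0-penalized least squares problem, which is the kind of problem a mixed-integer quadratic solver could handle; in this sketch, brute-force enumeration stands in for the solver so that the example stays self-contained. The function names (`rss`, `all_subsets`, `dinkelbach_subset`) and the synthetic data are illustrative assumptions.

```python
import itertools
import numpy as np


def rss(X, y, subset):
    """Residual sum of squares of least squares on an intercept plus the chosen columns."""
    n = len(y)
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ coef
    return float(r @ r)


def all_subsets(p):
    """Yield every subset of {0, ..., p-1} as a tuple of column indices."""
    for k in range(p + 1):
        yield from itertools.combinations(range(p), k)


def dinkelbach_subset(X, y, tol=1e-9, max_iter=50):
    """Minimize RSS(S) / (n - |S| - 1) over subsets S with Dinkelbach's method.

    Assumption: enumeration replaces the mixed-integer quadratic solver,
    so this is only practical for a handful of variables.
    """
    n, p = X.shape
    assert n > p + 1, "the denominator n - |S| - 1 must stay positive"
    table = {S: rss(X, y, S) for S in all_subsets(p)}  # cached RSS values

    S = ()                      # start from the intercept-only model
    lam = table[S] / (n - 1)    # current value of the ratio
    for _ in range(max_iter):
        # Parametric subproblem: min over S of RSS(S) - lam * (n - |S| - 1).
        S_new = min(table, key=lambda T: table[T] - lam * (n - len(T) - 1))
        gap = table[S_new] - lam * (n - len(S_new) - 1)
        S = S_new
        lam = table[S] / (n - len(S) - 1)
        if gap > -tol:          # subproblem optimum is zero: lam is optimal
            break
    return S, lam


# Synthetic data (assumed for illustration): only columns 0 and 3 carry signal.
rng = np.random.default_rng(0)
n, p = 40, 6
X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.5 * rng.standard_normal(n)

S, lam = dinkelbach_subset(X, y)
print("selected columns:", S, "residual mean square:", lam)
```

Each iteration solves one penalized subset problem and then resets lam to the ratio attained by its solution, so lam never increases. With a real solver in place of the enumeration, the number of subproblems, rather than the number of subsets, would drive the running time.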