摘要
The choice of the most promising algorithm to model and predict a particular phenomenon is one of the most prominent activities of the temporal data forecasting. Forecasting (or prediction), similarly to other data mining tasks, uses empirical evidence to select the most suitable model for a problem at hand since no modeling method can be considered as the best. However, according to our systematic literature review of the last decade, few scientific publications rigorously expose the benefits and limitations of the most popular algorithms for time series prediction. At the same time, there is a limited performance record of these models when applied to complex and highly nonlinear data. In this paper, we present one of the most extensive, impartial and comprehensible experimental evaluations ever done in the time series prediction field. From 95 datasets, we evaluate eleven predictors, seven parametric and four non-parametric, employing two multi-step-ahead projection strategies and four performance evaluation measures. We report many lessons learned and recommendations concerning the advantages, drawbacks, and the best conditions for the use of each model. The results show that SARIMA is the only statistical method able to outperform, but without a statistical difference, the following machine learning algorithms: ANN, SVM, and kNN-TSPI. However, such forecasting accuracy comes at the expense of a larger number of parameters. The evaluated datasets, as well detailed results achieved by different indexes as MSE, Theil’s U coefficient, POCID, and a recently-proposed multi-criteria performance measure are available online in our repository. Such repository is another contribution of this paper since other researchers can replicate our results and evaluate their methods more rigorously. The findings of this study will impact further research on this topic since they provide a broad insight into models selection, parameters setting, evaluation measures, and experimental setup.