Authors
Dawen Xia,Nan Yang,Shunying Jian,Hu Yang,Huaqing Li
Abstract
Accurate traffic flow forecasting (TFF) is important for mitigating traffic congestion. Traditional centralized models running on a single machine face computation and storage bottlenecks when processing big traffic flow data. To address these issues, this paper presents a Spark-based Weighted Bidirectional Long Short-Term Memory (SW-BiLSTM) model that improves the robustness and accuracy of TFF. Specifically, the resilient distributed dataset (RDD) and the Kalman filter (KF) are used to preprocess large-scale trajectory data (e.g., GPS trajectories of taxicabs). Next, a distributed SW-BiLSTM model on Spark is proposed to enhance the accuracy and efficiency of TFF; it uses a normal distribution to weight the influence of interactions between adjacent road segments and a time window to optimize BiLSTM. Finally, an empirical study on real-world taxi GPS trajectory data shows that, compared with ARIMA, LR, GNB, CNN, GRU, SAEs, BP, LSTM, and WND-LSTM (LSTM with a time window and a normal distribution), SW-BiLSTM reduces the MAPE by 65.62%, 17.78%, 87.29%, 69.10%, 3.52%, 21.09%, 59.66%, 42.86%, and 1.22%, respectively. In particular, SW-BiLSTM outperforms BiLSTM, with an average accuracy improvement of 15.83%.
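
The comparisons above are stated as percentage decreases in MAPE (mean absolute percentage error). As a minimal sketch, the following shows the standard MAPE definition and one plausible way to express a decrease relative to a baseline. The function names `mape` and `relative_decrease` are hypothetical, the sample values are toy numbers rather than data from the paper, and whether the paper computes its reported percentages exactly this way is an assumption.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, returned in percent.

    Assumes every actual value is non-zero; intervals with zero observed
    flow would need to be filtered out or handled separately.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def relative_decrease(baseline: float, candidate: float) -> float:
    """Percentage by which `candidate` is lower than `baseline`.

    Assumption: this is how a 'MAPE decreased by X%' figure is derived;
    the paper may define it differently.
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (baseline - candidate) / baseline


# Toy traffic-flow counts (illustrative only, not from the paper).
observed = [100.0, 200.0]
baseline_forecast = [80.0, 240.0]   # hypothetical baseline model output
candidate_forecast = [90.0, 220.0]  # hypothetical improved model output

baseline_mape = mape(observed, baseline_forecast)
candidate_mape = mape(observed, candidate_forecast)
improvement = relative_decrease(baseline_mape, candidate_mape)
```

A relative decrease makes results comparable across baselines with very different error levels, which may be why the abstract reports reductions per baseline instead of absolute MAPE values.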