Real-time, accurate forecasting of the traffic state on urban roads is important for improving traffic efficiency and optimizing travel routes. However, forecasting the future traffic state remains challenging because it is influenced by several complicated factors, including dynamic spatio-temporal dependencies. Existing models usually consider dependencies only among road sections that are physically connected and ignore road sections without physical connections. To this end, this paper proposes a deep ensemble neural network (DENN) model that improves the accuracy of urban traffic state forecasting by forming highly relevant road sections into a virtual graph. To capture spatio-temporal characteristics efficiently and simultaneously, the DENN integrates a graph convolutional network (GCN), a bidirectional gated recurrent unit (Bi-GRU) network, and a dense layer with an attention mechanism in an end-to-end fashion. Validated on two ground-truth urban traffic speed datasets, the DENN model fits the nonlinear fluctuation of urban traffic speed well and outperforms state-of-the-art benchmark methods in terms of prediction precision and robustness.

• A virtual network is established by forming road sections with high relevance into a virtual graph.
• A graph convolutional network (GCN) is introduced to mine spatial features of the traffic flow from the virtual graph.
• A deep ensemble neural network is built by fusing a GCN, a Bi-GRU network, and an attention model in an end-to-end fashion.
• Real-world urban traffic datasets are used to verify the proposed model in terms of prediction accuracy and stability.
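
The virtual-graph idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the text here does not specify how relevance between road sections is measured, so the sketch assumes Pearson correlation of historical speed series with a hypothetical threshold of 0.8. The function names (`build_virtual_adjacency`, `normalize_adjacency`, `gcn_layer`) are illustrative, and the symmetric normalization with self-loops is the form commonly used for GCNs, which may differ from the one used in the DENN.

```python
import numpy as np


def build_virtual_adjacency(speeds, threshold=0.8):
    """Link road sections whose speed series are highly correlated.

    speeds: array of shape (T, N), T time steps for N road sections.
    threshold: assumed cut-off on the Pearson correlation (hypothetical value).
    Returns a symmetric 0/1 adjacency matrix of shape (N, N) without self-loops.
    """
    corr = np.corrcoef(speeds, rowvar=False)      # (N, N) correlation matrix
    adj = (corr >= threshold).astype(float)       # keep only strong relevance
    np.fill_diagonal(adj, 0.0)                    # self-loops are added later
    return adj


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1)) # degrees are >= 1
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm, x, w):
    """One graph convolution step: ReLU(A_norm @ X @ W)."""
    return np.maximum(a_norm @ x @ w, 0.0)


# Toy data: sections 0 and 1 move together, section 2 is independent.
rng = np.random.default_rng(0)
base = rng.normal(size=200)
speeds = np.column_stack([
    base,
    base + 0.05 * rng.normal(size=200),
    rng.normal(size=200),
])

adj = build_virtual_adjacency(speeds)
a_norm = normalize_adjacency(adj)

# Treat the last 12 observations of each section as its node features.
x = speeds[-12:].T                                # shape (N, 12)
w = rng.normal(size=(12, 4))                      # untrained weights
spatial_features = gcn_layer(a_norm, x, w)        # shape (N, 4)
```

In this toy example the two correlated sections become neighbours in the virtual graph even though nothing states that they are physically connected, which is the situation the virtual graph is meant to capture. In a full model, the per-section spatial features would then be passed to the Bi-GRU and attention layers, which are omitted here.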