Deep neural network models have shown great potential in accelerating the simulation of fluid dynamic systems. Once trained, these models can make inferences within seconds and can thus be extremely efficient. However, they suffer from a generalization problem when the flow becomes chaotic and turbulent. One of the most important reasons is that existing models lack a mechanism to handle the unique characteristic of turbulent flow: multi-scale flow structures are non-uniformly distributed and strongly nonequilibrium. In this work, we address this issue with the concept of visual attention: intuitively, we expect the attention module to capture the nonequilibrium of turbulence by automatically adjusting weights on different regions. We benchmark the performance improvement with a state-of-the-art neural network model, the Fourier Neural Operator (FNO), on a two-dimensional (2D) turbulence prediction task. Numerical experiments show that the attention-enhanced neural network model generalizes well to flows at higher Reynolds numbers and accurately reconstructs a variety of statistics and instantaneous spatial structures of turbulence. The attention mechanism provides a 40% error reduction with a 1% increase in parameters, at the same level of computational cost.