Because elasto-plastic behavior is nonlinear and path-dependent, analyzing it with finite element (FE) models is computationally expensive. Deep learning models of plasticity, which learn directly from sequential data, have proven successful at rapidly predicting the path-dependent response. However, large-scale elasto-plastic FE models remain challenging because they involve numerous degrees of freedom and therefore high-dimensional data. This study proposes a practical framework for the surrogate modeling of large-scale elasto-plastic FE models that combines long short-term memory (LSTM) neural networks with proper orthogonal decomposition (POD). First, displacement, plastic strain magnitude, and von Mises stress data are generated using commercial FE analysis software; the high-dimensional data are then reduced to low-dimensional POD coefficients before being used for training. With the drastically reduced data, the neural networks can be arranged in individual and ensemble structures to achieve accurate and robust prediction. Because the proposed POD-LSTM surrogate model operates at the structural level, a separate POD-LSTM surrogate model is constructed and implemented for each of three large-scale elasto-plastic FE models. In all three examples, the proposed POD-LSTM surrogate models are found to predict elasto-plastic responses efficiently and accurately.
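The POD reduction step described above can be illustrated with a minimal sketch. It assumes the POD basis is obtained from a singular value decomposition of a mean-centered snapshot matrix, which is one common way to compute POD and is not confirmed here as the procedure used in this study. The function names (`pod_fit`, `pod_encode`, `pod_decode`), the energy threshold, and the synthetic rank-3 data are hypothetical and serve only as an illustration; the LSTM that would be trained on the resulting coefficients is not shown.

```python
import numpy as np


def pod_fit(snapshots, energy=0.9999):
    """Compute a POD basis from a snapshot matrix of shape (n_dof, n_snapshots).

    Assumption: modes are kept until the cumulative squared singular values
    reach the fraction `energy`; this threshold is illustrative only.
    """
    mean = snapshots.mean(axis=1, keepdims=True)
    # Thin SVD of the centered snapshots; columns of U are the POD modes.
    U, s, _ = np.linalg.svd(snapshots - mean, full_matrices=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    r = min(int(np.searchsorted(cumulative, energy)) + 1, len(s))
    return mean, U[:, :r]


def pod_encode(fields, mean, basis):
    """Project full-field data (n_dof, n) onto the basis, giving (r, n) coefficients."""
    return basis.T @ (fields - mean)


def pod_decode(coeffs, mean, basis):
    """Reconstruct full-field data (n_dof, n) from POD coefficients (r, n)."""
    return mean + basis @ coeffs


# Synthetic stand-in for FE output: 2000 degrees of freedom, 50 snapshots,
# generated from 3 hidden spatial modes (hypothetical data, not FE results).
rng = np.random.default_rng(0)
modes = rng.standard_normal((2000, 3))
amplitudes = rng.standard_normal((3, 50))
snapshots = modes @ amplitudes

mean, basis = pod_fit(snapshots)
coeffs = pod_encode(snapshots, mean, basis)   # low-dimensional training data
recon = pod_decode(coeffs, mean, basis)       # back to the full field

rel_error = np.linalg.norm(recon - snapshots) / np.linalg.norm(snapshots)
print(basis.shape, coeffs.shape, rel_error)
```

In this sketch, each snapshot with 2000 entries is represented by a handful of coefficients. A sequence model would then be trained to predict the coefficient histories, and `pod_decode` would map its predictions back to the full field.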