Grain temperature forecasting is crucial to ventilation management in granaries, as it enables precautions against the grain mold caused by rising grain temperature. Despite their ability to extract features from nonlinear temperature data, state-of-the-art forecasting models based on artificial intelligence remain limited in both forecast efficiency and accuracy. All existing models are inefficient because their outputs are restricted to either a single representative sensor or the average temperature of a given layer at a time. Most of these models also ignore the spatial topology of the sensor network, which limits their forecast accuracy. This paper therefore proposes a multi-output spatiotemporal model that combines a graph convolutional network (GCN) with a Transformer to address both issues. The GCN captures the spatial correlations among the sensors and the topological information of the sensor network in the granary. The Transformer captures both long-term and short-term temporal features and describes the temporal dependencies. The proposed model is built on a real-world dataset from a granary in Shaanxi, China, and its performance is evaluated and compared with that of four existing models. The results show that the proposed model outperforms the others in terms of mean absolute error (MAE) and root mean square error (RMSE). Furthermore, a three-dimensional interpolation of the forecast results yields a continuous temperature field for the entire granary, which makes temperature conditions available at all locations, not only at the discrete points covered by sensors.
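
To make the spatial component and the two error metrics concrete, the following is a minimal sketch rather than the implementation used in this paper. It assumes the widely used symmetric-normalized form of graph convolution, ReLU(D^(-1/2) (A + I) D^(-1/2) X W), applied to a hypothetical four-sensor chain; the function names, the toy adjacency matrix, and the temperature values are illustrative assumptions only. Plain Python lists are used to keep the sketch free of dependencies.

```python
import math


def normalized_adjacency(adj):
    """Return D^(-1/2) (A + I) D^(-1/2) for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so each sensor keeps part of its own reading.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]  # always >= 1 because of self-loops
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(adj, features, weights):
    """One graph-convolution step followed by ReLU."""
    propagated = matmul(matmul(normalized_adjacency(adj), features), weights)
    return [[max(v, 0.0) for v in row] for row in propagated]


def mae(y_true, y_pred):
    """Mean absolute error over paired values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error over paired values."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


# Hypothetical toy network: four sensors connected in a chain 0-1-2-3.
adj = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
temps = [[20.0], [21.0], [23.0], [22.0]]  # one feature per sensor (made up)
weights = [[1.0]]                         # identity weight, for illustration

mixed = gcn_layer(adj, temps, weights)    # each row blends neighboring sensors
```

In this sketch each output row is a degree-weighted blend of a sensor's own reading and those of its neighbors, which is the sense in which a graph convolution uses the topology of the sensor network. A trained model would learn `weights` and stack such layers ahead of the temporal module.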