Next-POI recommendation aims to predict the next location a user is likely to visit by mining the user's check-in sequence. Existing methods often struggle to model the implicit associations between multi-modal data and user choices. Moreover, traditional methods fail to fully capture how user preferences vary across variable time intervals. To address these limitations, we propose a Multi-Modal Temporal Knowledge Graph-aware Sub-graph Embedding approach (Mandari). We first construct a novel Multi-Modal Temporal Knowledge Graph. Based on this knowledge graph, we integrate multi-modal information and leverage a graph attention network to compute a sub-graph prediction probability. Next, we apply a temporal knowledge mining method to model the segmentation and periodicity of user check-ins and obtain a temporal prediction probability. Finally, we fuse the temporal prediction probability with the sub-graph prediction probability to obtain the final result. Extensive experiments demonstrate that our approach outperforms existing state-of-the-art methods.
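
The final fusion step can be illustrated with a small sketch. The abstract does not specify how the two probabilities are combined, so the code below assumes a simple weighted (convex) combination over a shared list of candidate POIs. The function names `fuse_probabilities` and `top_k`, and the weight `alpha`, are hypothetical and chosen only for illustration; they are not taken from the Mandari method itself.

```python
def fuse_probabilities(p_subgraph, p_temporal, alpha=0.5):
    """Fuse two probability vectors over the same candidate POIs.

    ASSUMPTION: a convex combination weighted by `alpha`; the actual
    fusion rule of the approach is not given in the abstract.
    """
    if len(p_subgraph) != len(p_temporal):
        raise ValueError("both vectors must cover the same candidate POIs")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    # Weighted sum of the sub-graph and temporal scores per candidate.
    fused = [alpha * s + (1.0 - alpha) * t
             for s, t in zip(p_subgraph, p_temporal)]

    # Renormalize so the fused scores again sum to 1.
    total = sum(fused)
    if total == 0:
        raise ValueError("fused scores sum to zero; cannot normalize")
    return [f / total for f in fused]


def top_k(probs, k):
    """Return the indices of the k highest-scoring candidates."""
    return sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]


# Toy example with three candidate POIs (values are made up).
p_sub = [0.5, 0.3, 0.2]   # hypothetical sub-graph prediction probability
p_tmp = [0.1, 0.2, 0.7]   # hypothetical temporal prediction probability
fused = fuse_probabilities(p_sub, p_tmp, alpha=0.5)
ranking = top_k(fused, k=2)
```

In this toy case the temporal signal lifts the third candidate to the top of the ranking even though the sub-graph score alone would favor the first, which is the kind of interaction a fusion step is meant to capture.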