While ultra-dense networks (UDNs) greatly enhance network performance, the extensive deployment of small base stations poses significant energy consumption challenges. Traditional ON/OFF base station sleep schemes alleviate some of these energy issues, but complete shutdowns and lengthy reactivation times create coverage gaps in the network and severely degrade the quality of service delivered to users. In this paper, we introduce a multi-level Sleep Mode (SM) technique, focusing specifically on energy-efficient task offloading in Mobile Edge Computing (MEC) scenarios. To ensure the performance of delay-sensitive services on user devices, we employ stochastic network calculus (SNC) theory to analyze the stability of the two-stage system. Building on the SNC-derived delay bounds, we propose a Multi-Agent Deep Deterministic Policy Gradient (MADDPG) based approach, which we refer to as SNC-MADDPG, that aims to minimize long-term system energy consumption. Numerical results demonstrate that the proposed algorithm achieves greater energy savings under reliability constraints than other optimization algorithms. Furthermore, the results indicate that the multi-level sleep mode outperforms traditional ON/OFF base station sleep schemes in meeting the reliability requirements of delay-sensitive applications.