Ahmed Rabee Sayed, Xian Zhang, Guibin Wang, Jing Qiu, Cheng Wang
Source
Journal: IEEE Transactions on Power Systems [Institute of Electrical and Electronics Engineers] · Date: 2023-09-28 · Volume/Issue: 39 (2): 2893-2906 · Citations: 6
Identifier
DOI: 10.1109/TPWRS.2023.3320172
Abstract
Growing interdependencies between power and gas systems, together with the integration of large-scale intermittent renewable energy, increase the complexity of energy management problems. This article proposes a model-free safe deep reinforcement learning (DRL) approach to find fast optimal energy flow (OEF) decisions, guaranteeing their feasibility in real-time operation with high computational efficiency. The OEF optimization problem is formulated as a constrained Markov decision process with a compact set of states and control actions, and a robust integrated environment is developed. Because state-of-the-art DRL algorithms lack safety guarantees, this article develops a soft-constraint enforcement method that adaptively steers the control policy toward safety while keeping control actions non-conservative. The overall procedure, namely the constrained soft actor-critic (C-SAC) algorithm, is off-policy, entropy-maximization-based, sample-efficient, and scalable, with low hyper-parameter sensitivity. The proposed C-SAC algorithm demonstrates its superiority over existing safe learning-based algorithms and conventional OEF solution methods by finding fast OEF decisions with near-zero constraint violations. The proposed approach demonstrates its practicability for real-time energy system operation and its potential for extension to other applications.
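The abstract does not spell out how the soft-constraint enforcement is implemented. The sketch below illustrates one common way to read "adaptively encouraging the policy toward safety": a Lagrangian-style dual variable that scales a constraint-cost penalty in a SAC-style actor update and is grown or shrunk depending on the estimated violation. All module names, dimensions, and hyper-parameters here are hypothetical illustrations, not the paper's actual C-SAC implementation.

```python
# Minimal sketch of an adaptive soft-constraint penalty on a SAC-like actor
# update. Everything below (PolicyNet, cost_critic, COST_LIMIT, sizes) is a
# hypothetical illustration, not the paper's C-SAC code.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 8, 3   # assumed sizes of the OEF state/control vectors
COST_LIMIT = 0.0               # target cumulative constraint violation (near-zero)

class PolicyNet(nn.Module):
    """Gaussian policy head, as in standard SAC."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU())
        self.mu = nn.Linear(64, ACTION_DIM)
        self.log_std = nn.Linear(64, ACTION_DIM)

    def forward(self, s):
        h = self.body(s)
        dist = torch.distributions.Normal(self.mu(h), self.log_std(h).exp())
        a = dist.rsample()                        # reparameterized sample
        # (tanh log-prob correction omitted for brevity)
        return torch.tanh(a), dist.log_prob(a).sum(-1)

policy = PolicyNet()
reward_critic = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 64),
                              nn.ReLU(), nn.Linear(64, 1))
cost_critic = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 64),
                            nn.ReLU(), nn.Linear(64, 1))

alpha = 0.2                                       # SAC entropy temperature
log_lam = torch.zeros(1, requires_grad=True)      # adaptive penalty multiplier
pi_opt = torch.optim.Adam(policy.parameters(), lr=3e-4)
lam_opt = torch.optim.Adam([log_lam], lr=1e-3)

def actor_step(states):
    """One actor update: SAC entropy objective plus an adaptive cost penalty."""
    actions, logp = policy(states)
    sa = torch.cat([states, actions], dim=-1)
    q_r = reward_critic(sa).squeeze(-1)           # expected return estimate
    q_c = cost_critic(sa).squeeze(-1)             # expected constraint-violation cost
    lam = log_lam.exp().detach()
    # Maximize reward plus entropy while penalizing predicted constraint cost.
    actor_loss = (alpha * logp - q_r + lam * q_c).mean()
    pi_opt.zero_grad()
    actor_loss.backward()
    pi_opt.step()
    # Dual ascent: grow lambda while the cost estimate exceeds the limit and
    # shrink it otherwise, so the penalty does not stay over-conservative.
    lam_loss = -(log_lam.exp() * (q_c.detach().mean() - COST_LIMIT))
    lam_opt.zero_grad()
    lam_loss.backward()
    lam_opt.step()

states = torch.randn(32, STATE_DIM)  # dummy batch of operating states
actor_step(states)
```

The adaptive multiplier is the key design choice in such schemes: unlike a fixed penalty weight, it decays when the policy is safely inside the feasible region, which is one plausible reading of the abstract's claim that safety is enforced without conservative control actions.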