
Multi-energy microgrid through RL


Wang, Y., Qiu, D., Sun, M., Strbac, G., & Gao, Z. (2023). Secure energy management of multi-energy microgrid: A physical-informed safe reinforcement learning approach. Applied Energy, 335, 120759.


Integrating distributed energy resources speeds up the transition to a low-carbon future but also makes safe and reliable operation harder. Multi-Energy Microgrids (MEMGs), which combine several energy sources to enhance stability, are one workable solution. Both conventional optimization and model-free learning have been applied to MEMG energy management. However, traditional reinforcement learning (RL) often struggles to respect physical limits, which puts secure operation at risk. To address this, Wang et al. (2023) propose a safe RL technique with a dynamic security assessment layer that respects physical boundaries by solving an action-correction problem. This keeps the MEMG operating safely during both training and testing. The paper's experiments indicate that this physical-informed RL method outperforms classic RL and optimization strategies at constraint-compliant, cost-effective MEMG energy management.

Limitations of existing methods:

  • Accurate system knowledge is hard to obtain because of privacy concerns and system aging.
  • Optimization-based approaches require extensive re-optimization for every possible system state.
  • Model-free learning has difficulty representing physical constraints.
  • Without full knowledge of the system, a controller risks taking unsafe actions.


PI-SPPO Approach for MEMG Energy Management

The proposed PI-SPPO method addresses the MEMG energy management problem subject to physical constraints. It combines three elements: a model-free Proximal Policy Optimization (PPO) control policy, a security assessment rule, and a physical-informed safety layer. The security assessment rule, which is part of the safety layer, approximates the safe operating region. The PPO control policy uses an actor-critic architecture, which lets it handle high-dimensional state and action spaces.
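To make the PPO part more concrete, here is a minimal sketch of the standard clipped surrogate objective that PPO maximizes. It assumes the common "PPO-clip" form; this post does not confirm which PPO variant or clipping value the paper uses, so the function name and the default `clip_eps=0.2` are illustrative only.

```python
import math


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized).

    new_log_probs / old_log_probs: log-probabilities of the sampled actions
    under the current policy and the policy that collected the data.
    advantages: advantage estimates, e.g. produced with the critic.
    clip_eps: clipping range (an assumed value, not taken from the paper).
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the current and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps] to limit the policy update.
        clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the clipped and unclipped terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)


# If the policy has not changed, the objective is just the mean advantage.
unchanged = ppo_clipped_objective([0.0, 0.0], [0.0, 0.0], [1.0, 2.0])
```

The clipping is what limits how far a single update can move the policy, which is one reason PPO is usually regarded as stable to train.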

Safety Layer and Training Process

The safety layer automatically corrects actions proposed by the PPO policy by solving an optimization problem based on the security assessment rule. This keeps the control policy within the physical constraints while preserving training stability and sample efficiency. The safety layer is first trained offline and then refined through continuous online learning, so its accuracy and adaptability improve as training proceeds. In this way the agent can keep exploring while MEMG operation stays safe and efficient.
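The action correction can be illustrated with a deliberately simplified sketch. It assumes the security rule can be written as a single linear constraint on the action, `weights . a + bias <= 0`, and that "correcting" means finding the closest action that satisfies it. Both assumptions, and the names `correct_action`, `weights` and `bias`, are hypothetical; the formulation in the paper is more general than this.

```python
def correct_action(action, weights, bias):
    """Return the action closest (in Euclidean distance) to `action`
    that satisfies the assumed linear rule  weights . a + bias <= 0.

    This is the closed-form projection onto a half-space; it stands in
    for the optimization problem the safety layer solves.
    """
    violation = sum(w * a for w, a in zip(weights, action)) + bias
    if violation <= 0.0:
        # The proposed action is already judged safe: leave it untouched.
        return list(action)
    norm_sq = sum(w * w for w in weights)
    if norm_sq == 0.0:
        raise ValueError("constraint cannot be satisfied by changing the action")
    # Move the action just far enough along -weights to reach the boundary.
    step = violation / norm_sq
    return [a - step * w for w, a in zip(weights, action)]


# Hypothetical rule: the two action components may sum to at most 2.
proposed = [2.0, 2.0]
corrected = correct_action(proposed, weights=[1.0, 1.0], bias=-2.0)
```

Because safe actions pass through unchanged, a layer of this kind only interferes with the policy when the rule predicts a violation.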


Structure of PI-SPPO from the study by Wang, Y., Qiu, D., Sun, M., Strbac, G., & Gao, Z. (2023)



  • Introduces the PI-SPPO method for MEMG energy management.
  • Uses PPO, with tuned hyperparameters, for sample-efficient learning.
  • Trains a security assessment rule using supervised learning.
  • Integrates a safety layer for secure MEMG operation.
  • Demonstrates effectiveness in practical energy scheduling.
  • Lowers energy management costs.
  • Keeps the MEMG operating securely.
  • Future work: extend the approach to the cooling and heating sectors and develop robust learning for exogenous states.
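As a rough illustration of the supervised-learning step in the list above, the sketch below fits a tiny logistic-regression classifier that labels operating points as safe or unsafe. The model type, the one-dimensional "loading level" feature, the toy data and all function names are assumptions made for this example; the post does not describe the model or data used in the paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_security_rule(samples, labels, lr=0.5, epochs=2000):
    """Fit logistic regression with full-batch gradient descent.

    samples: list of feature tuples; labels: 1 for safe, 0 for unsafe.
    """
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of the log-loss with respect to the logit
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def is_safe(weights, bias, x, threshold=0.5):
    """Classify an operating point as safe if its predicted probability is high enough."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= threshold


# Toy data: a hypothetical loading level; low values are safe, high are not.
samples = [(0.0,), (0.2,), (0.4,), (1.6,), (1.8,), (2.0,)]
labels = [1, 1, 1, 0, 0, 0]
weights, bias = train_security_rule(samples, labels)
```

A rule learned this way is only an approximation of the true safe region, which is consistent with the post's point that the safety layer keeps being refined online after its offline training.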

Sakthivel R

I am a first-year M.Sc. AIML student at SASTRA University.

I am proficient in Python and a range of related frameworks. I have experience with data and development tools including SQL, Excel, Pandas, Scikit, TensorFlow, Git, and Power BI.
