
DRL for Energy Optimization in Buildings


 Qin, Y., Ke, J., Wang, B., & Filaretov, G. F. (2022). Energy optimization for regional buildings based on distributed reinforcement learning. Sustainable Cities and Society, 78, 103625.


Because they are affordable and scalable, model-free control methods such as Reinforcement Learning (RL) are widely used in energy management. However, conventional RL struggles to provide coordinated, efficient control across the buildings of a region, which leads to higher energy consumption. This post looks at Distributed Reinforcement Learning (DRL), a technique for optimizing energy use across several buildings while preserving occupant comfort. By sharing parameters and coordinating optimization among agents, the system reduces overall energy use. In a case study of nine university buildings, DRL achieved lower overall energy consumption than Rule-Based Control (RBC), the Soft Actor-Critic (SAC) approach, Model Predictive Control (MPC), and the Non-dominated Sorting Genetic Algorithm II (NSGA-II). The proposed strategy also showed good accuracy and robustness in evaluations of multi-building energy consumption, error analysis, load factor, power demand, and net power consumption.

Limitations of existing methods:

  • RBC relies on basic “If X, then Y” rules that are rarely adjusted adequately, so it often falls short of its full potential.
  • MPC works well in simulations, but its real-world applicability is limited because it has not been implemented in large-scale, fully occupied buildings.
  • Data-driven models such as random forest predictive control suffer from overfitting and high variance, which produce inaccurate results.
  • Because of their complexity, methods such as MPC, RBC, and Genetic Algorithms (GA) are difficult to scale to regional or local applications.
  • Difficulties in data collection, long sampling periods, and environmental disturbances lead to incomplete datasets, which reduce the accuracy and effectiveness of models.
  • Current systems coordinate poorly between buildings, which results in energy coupling and higher total consumption.
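To make the first limitation concrete, a rule-based controller is essentially a fixed lookup from conditions to actions. The sketch below is a hypothetical illustration and is not taken from the paper: the function name, the hour thresholds, and the charge and discharge rates are all assumptions chosen for the example.

```python
def rbc_storage_action(hour: int) -> float:
    """Return a storage action in [-1, 1] for a given hour of the day.

    Positive values charge the storage, negative values discharge it.
    All thresholds and rates below are illustrative assumptions.
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in the range 0..23")
    if hour >= 22 or hour < 6:
        # "If it is night (assumed off-peak), then charge slowly."
        return 0.1
    if 9 <= hour < 21:
        # "If it is daytime (assumed peak), then discharge slowly."
        return -0.08
    # Otherwise leave the storage idle.
    return 0.0
```

Because the rule ignores weather, occupancy, and what neighbouring buildings are doing, it cannot adapt unless someone retunes the thresholds by hand.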


Scheme for Control Optimization Employing DRL:  

The study proposes a control optimization strategy based on Distributed Reinforcement Learning (DRL) to reduce energy consumption in regional buildings. Built on the CityLearn framework, the method enhances the MARLISA algorithm with an improved Least Squares Boosting (LSBoost) algorithm for energy prediction and an incentive mechanism that encourages staggered power usage. Multiple reinforcement learning agents follow a sequential action selection strategy, iteratively choosing actions in turn to control each building’s energy usage.
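The idea of sequential action selection can be sketched in a few lines. This is a toy illustration under stated assumptions, not the paper's algorithm: a greedy rule stands in for the learned policy, and the names `select_actions_sequentially`, `ACTIONS`, and `target` are hypothetical. Each agent picks its action in turn, knowing what the others have currently committed to, and the loop is repeated for a few sweeps.

```python
from typing import List, Sequence

# Hypothetical discrete storage actions in kW: discharge, idle, charge.
ACTIONS = (-1.0, 0.0, 1.0)


def select_actions_sequentially(
    base_loads: Sequence[float],
    target: float,
    actions: Sequence[float] = ACTIONS,
    sweeps: int = 2,
) -> List[float]:
    """Let each building choose an action in turn, given the others' choices.

    A greedy rule (an assumption standing in for a trained RL policy) picks
    the action that brings the district total closest to `target`.
    """
    chosen = [0.0] * len(base_loads)
    for _ in range(sweeps):
        for i, load in enumerate(base_loads):
            # Demand already committed by every other building.
            others = sum(
                base_loads[j] + chosen[j]
                for j in range(len(base_loads))
                if j != i
            )
            chosen[i] = min(
                actions, key=lambda a: abs(others + load + a - target)
            )
    return chosen
```

For example, with base loads of 5, 3, and 4 kW and a district target of 10 kW, the first two buildings discharge and the third stays idle, so the total lands on the target. In the actual method the choice comes from a learned policy and the shared information is a predicted demand rather than an exact one.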

Improvements and Iterative Learning:

The Soft Actor-Critic (SAC) method is used for its scalability and coordination capabilities, while the LSBoost technique is improved to increase prediction accuracy. Agents interact with the environment and optimize energy use through performance-based rewards or penalties. Through an iterative learning process, agents forecast and share their energy usage, which coordinates consumption across buildings, optimizes regional energy consumption, and preserves occupant comfort. By lowering energy use and carbon emissions, this strategy aims to promote sustainable urban development.
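For readers unfamiliar with LSBoost, the sketch below shows plain least-squares boosting with one-split regression stumps on a single feature. It is the standard technique, not the improved variant from the paper, and the function names, the learning rate, and the data used to try it out are assumptions for illustration only.

```python
from typing import List, Sequence, Tuple

Stump = Tuple[float, float, float]        # (threshold, left_value, right_value)
Model = Tuple[float, float, List[Stump]]  # (base, learning_rate, stumps)


def fit_stump(x: Sequence[float], residuals: Sequence[float]) -> Stump:
    """Find the single split on x that minimizes squared error."""
    mean_all = sum(residuals) / len(residuals)
    best: Stump = (float("inf"), mean_all, mean_all)  # fallback: no split
    best_err = sum((r - mean_all) ** 2 for r in residuals)
    for threshold in sorted(set(x))[:-1]:  # keep both sides non-empty
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left)
        err += sum((r - rm) ** 2 for r in right)
        if err < best_err:
            best_err, best = err, (threshold, lm, rm)
    return best


def stump_predict(stump: Stump, xi: float) -> float:
    threshold, left_value, right_value = stump
    return left_value if xi <= threshold else right_value


def fit_lsboost(x: Sequence[float], y: Sequence[float],
                rounds: int = 50, learning_rate: float = 0.1) -> Model:
    """Repeatedly fit a stump to the current residuals and add it in."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps: List[Stump] = []
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [p + learning_rate * stump_predict(stump, xi)
                for p, xi in zip(pred, x)]
    return base, learning_rate, stumps


def predict(model: Model, xi: float) -> float:
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, xi) for s in stumps)


def mean_squared_error(y: Sequence[float], pred: Sequence[float]) -> float:
    return sum((a - b) ** 2 for a, b in zip(y, pred)) / len(y)
```

Calling `fit_lsboost(temperatures, loads)` on a small set of made-up temperature and load pairs and then `predict(model, t)` gives a fitted load estimate. Each round corrects part of the remaining error, which is why the training error falls below that of a constant mean prediction.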


Schematic representation of the building optimal control closed-loop system using the MPC/GA controller from the study by Qin, Y., Ke, J., Wang, B., & Filaretov, G. F. (2022) 


  • Develops a regional building energy optimization method using distributed multi-agent reinforcement learning.
  • Improves the LSBoost algorithm for more accurate energy consumption predictions.
  • Applies layer normalization to the critic networks to speed up training.
  • Uses the Huber loss to avoid exploding gradients and oversensitivity to outliers.
  • In numerical simulations, DRL reduces energy consumption by 6.72% compared with RBC and by 3.67% compared with SAC.
  • The DRL method is scalable and provides good energy optimization.
  • A decentralized, distributed network with no central controller improves the system structure.
  • The system coordinates energy consumption among agents to support scalability.
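Two of the points above, the Huber loss and layer normalization, are easy to show in isolation. The following is a minimal pure-Python sketch of the standard definitions, not the paper's network code. The `delta` and `eps` defaults are assumptions, and the layer normalization omits the learnable gain and bias that a real critic network would include.

```python
import math
from typing import List, Sequence


def huber_loss(error: float, delta: float = 1.0) -> float:
    """Quadratic for small errors, linear for large ones."""
    magnitude = abs(error)
    if magnitude <= delta:
        return 0.5 * error * error
    return delta * (magnitude - 0.5 * delta)


def huber_grad(error: float, delta: float = 1.0) -> float:
    """Gradient of the Huber loss w.r.t. the error, clipped to [-delta, delta]."""
    return max(-delta, min(delta, error))


def layer_norm(values: Sequence[float], eps: float = 1e-5) -> List[float]:
    """Normalize one layer's activations to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]
```

The clipped gradient is what limits the effect of outliers: an error of 100 contributes a gradient of 1, whereas a squared-error loss would contribute 100. Layer normalization keeps the scale of activations stable from one update to the next, which is the usual explanation for faster training.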

Sakthivel R

I am a first-year M.Sc. AIML student at SASTRA University.

I am proficient in a variety of programming languages and frameworks, including Python. I have experience with data and development tools including SQL, Excel, Pandas, Scikit, TensorFlow, Git, and Power BI.
