
PV inverter-based volt/VAR control using graph reinforcement learning (GRL)


Yan, R., Xing, Q., & Xu, Y. (2023). Multi-agent safe graph reinforcement learning for PV inverters-based real-time decentralized volt/VAR control in zoned distribution networks. IEEE Transactions on Smart Grid.


This research presents a multi-agent safe graph reinforcement learning approach for optimizing the reactive power output of PV inverters, thereby enabling real-time volt/VAR control (VVC) in active distribution networks (ADNs). Specifically, each zone in the network adopts a decentralized architecture for coordinating reactive power regulation, managing voltage profiles, and reducing energy loss. The VVC problem is formulated as a multi-agent, decentralized, partially observable, constrained Markov decision process. Furthermore, graph convolutional networks (GCNs) are utilized by the central control agent in each zone to enhance decision-making through the extraction of features from the ADN topology, filtering of measurement noise, and imputation of missing data. By optimizing primal-dual policies, the approach ensures that voltage safety requirements are fulfilled. Consequently, this method successfully reduces network energy loss and voltage deviations, as evidenced by simulations on a 141-bus distribution system.

Limitations in the existing methods: 

  • The high resistance-to-reactance ratio of distribution feeders affects voltage stability.
  • Intermittent photovoltaic output causes frequent voltage fluctuations.
  • Traditional VVC relies on slow mechanical devices.
  • On-load tap changers (OLTCs) and capacitor banks (CBs) cannot respond quickly enough to PV fluctuations.
  • Effective real-time VVC techniques are therefore required.
  • Better control models are needed for PV inverters.
  • The VVC techniques currently in use are rigid.
  • IEEE 1547.8, although vague, promotes the use of inverters for reactive power support.


The MAPDGRL Approach  

The decentralized partially observable constrained Markov decision process (Dec-POCMDP) formulation of PV inverter-based volt/VAR control (VVC) in active distribution networks (ADNs) is solved by the Multi-Agent Primal-Dual Graph Reinforcement Learning (MAPDGRL) technique. This method trains agents centrally to learn optimal policies, which are then executed in a decentralized manner based on local observations. Using a layer-propagation mechanism that normalizes the graph structure and aggregates neighbor data, graph convolutional networks (GCNs) extract the power network's graph-structured features.
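The GCN layer-propagation step described above can be sketched as follows. This is a minimal NumPy illustration of the standard GCN rule (symmetrically normalized adjacency with self-loops); the bus count, feature layout, and weight shapes are toy assumptions, not values from the paper.

```python
import numpy as np

def gcn_layer(adj, features, weights):
    """One GCN propagation step: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt      # symmetric normalization
    return np.maximum(0.0, a_norm @ features @ weights)  # ReLU activation

# Toy 4-bus radial feeder: edges 0-1, 1-2, 2-3 (hypothetical topology)
adj = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3)]:
    adj[i, j] = adj[j, i] = 1.0

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3))   # e.g. per-bus [voltage, P, Q] measurements
weights = rng.normal(size=(3, 8))    # learnable layer weights

h = gcn_layer(adj, features, weights)
print(h.shape)  # (4, 8): an 8-dimensional embedding per bus
```

Each bus's embedding mixes its own measurements with those of adjacent buses, which is what lets the agent exploit the feeder topology.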

The actor network employs a fully connected neural network after a multi-layer GCN to map partial observations to control actions. Additionally, deep neural networks (DNNs) in the reward and cost critic networks estimate expected rewards and constraint costs. Dual variables are introduced and updated using sampled dual gradients to ensure the voltage constraints are satisfied.
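The dual-variable update mentioned above can be illustrated with a generic projected dual-gradient step, standard in primal-dual constrained RL: the multiplier grows while the sampled constraint cost exceeds its limit and is clipped at zero otherwise. The function name, learning rate, and numbers below are hypothetical, not taken from the paper.

```python
def update_dual(lmbda, avg_constraint_cost, limit, lr=0.01):
    """Projected dual-gradient step: lambda <- max(0, lambda + lr*(J_c - d)).

    lmbda: current dual variable (Lagrange multiplier)
    avg_constraint_cost: sampled average constraint cost J_c (e.g. voltage violation)
    limit: allowed constraint budget d
    """
    return max(0.0, lmbda + lr * (avg_constraint_cost - limit))

# Sampled voltage-violation cost exceeds the budget -> the multiplier grows,
# which penalizes unsafe actions more heavily in the actor's objective.
lmbda = 0.5
lmbda = update_dual(lmbda, avg_constraint_cost=0.2, limit=0.05)
print(round(lmbda, 4))  # 0.5015
```

When the policy stays within the voltage budget, the gradient turns negative and the multiplier decays back toward zero.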

Robustness and Training  

The training process comprises parameter initialization, centralized training with replay buffers, and recurring updates to the dual variables and network parameters. Robustness against noise and missing data is improved by two GCN properties: the message-passing mechanism, which fills in missing measurements using information from neighboring nodes, and the Graph Fourier Transform view of GCNs as low-pass filters. In summary, the technique combines GCNs for feature extraction, dual variables for constraint satisfaction, and a multi-agent DRL framework for optimal VVC in ADNs.
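The message-passing effect on missing data can be approximated with a simple neighbor-averaging sketch: a bus whose measurement is lost inherits the mean of its observed neighbors' values. This is a deliberately simplified stand-in for what a learned GCN does implicitly; the topology and readings below are made up for illustration.

```python
import numpy as np

def impute_missing(adj, x, missing_mask):
    """Replace missing node features with the mean of observed neighbors,
    mimicking one round of GCN-style message passing."""
    x = x.copy()
    for i in np.where(missing_mask)[0]:
        neighbors = np.where(adj[i] > 0)[0]
        observed = [j for j in neighbors if not missing_mask[j]]
        if observed:                       # leave the gap if no neighbor is observed
            x[i] = np.mean(x[observed], axis=0)
    return x

# Hypothetical 4-bus line feeder; the voltage reading at bus 1 is lost
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
x = np.array([[1.02], [0.0], [0.98], [1.00]])   # per-unit voltages
missing = np.array([False, True, False, False])

print(impute_missing(adj, x, missing))  # bus 1 becomes (1.02 + 0.98) / 2 = 1.00
```

Smoothing over neighbors is also why GCNs act as graph low-pass filters: high-frequency components, such as sensor noise on a single bus, are attenuated by the same averaging.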


  • The proposed MAPDGRL method significantly improves real-time volt/VAR control (VVC) in distribution networks with high PV penetration.
  • By incorporating GCNs into the agents' policy networks, it enhances decision-making and automatically handles imperfect measurement data.
  • Additionally, it uses Lagrangian relaxation to manage voltage constraints effectively, merging topology and feature information for better knowledge representation.
  • Furthermore, it acts as a graph-structured "low-pass filter" to reduce noise and fill in missing information through message passing.
  • Consequently, simulations on a 141-bus testing network show that MAPDGRL outperforms benchmark algorithms such as MADDPG and MAPDDDPG in efficiency, effectiveness, and robustness, even with incomplete or noisy feature information.

Sakthivel R

I am a first-year M.Sc. (AIML) student at SASTRA University.

I am proficient in several programming languages and frameworks, including Python, and have experience with data and analytics tools such as SQL, Excel, Pandas, scikit-learn, TensorFlow, Git, and Power BI.
