GCN for UAV coverage control


Dai, A., Li, R., Zhao, Z., & Zhang, H. (2020, October). Graph convolutional multi-agent reinforcement learning for UAV coverage control. In 2020 International Conference on Wireless Communications and Signal Processing (WCSP) (pp. 1106-1111). IEEE.


A growing number of unmanned aerial vehicles (UAVs) are being employed as mobile base stations because of their adaptability and capacity for dynamic coverage. However, UAVs must work together in cooperative groups due to their limited computing and energy resources. These groups form dynamic, graph-like local networks in which UAVs stay linked to exchange data and improve efficiency. This work presents a method for managing UAV groups using graph convolutional multi-agent reinforcement learning (MARL). By exploiting the mutual interactions between UAVs, the approach reduces energy consumption, promotes fairness, and improves signal coverage. Simulations reported in the paper indicate that this strategy manages UAV networks more effectively and efficiently.

Issues in existing methods:

  • Limited Coverage Range: Effective collaboration is necessary because each UAV has limited signal coverage.
  • Energy Constraints: The performance and duration of UAV operations are impacted by limited power.
  • Dynamic Topology: The network structure is continuously changing due to fluctuating UAV locations.
  • Quality of Service (QoS): Maintaining a high standard of communication might be difficult.
  • Scalability Issues: When there are more UAVs, many algorithms find it difficult to scale well.
  • Environmental Adaptability: It’s vital to be able to adjust to challenging or disaster-prone situations.


Graph-Based Modeling and Encoding of UAV Observations:

UAVs acting as mobile base stations can be modeled as nodes in a graph, with an edge between two UAVs whenever the distance between them falls within communication range. In this configuration, each UAV gathers local observations, such as its position, velocity, and the positions of ground users, which are crucial for decision-making. Based on these observations, each UAV acts according to its policy and then receives a reward. To enhance learning, these experiences are stored in a replay buffer so that they can be revisited during later training. Additionally, a multi-layer perceptron (MLP) encodes each observation into a feature vector, enabling more efficient processing of the collected data.
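The graph construction and replay buffer described above can be sketched roughly as follows. This is only an illustration, not the authors' code: the communication range `COMM_RANGE`, the 2-D positions, and the names `build_adjacency` and `ReplayBuffer` are all assumptions made for the example.

```python
import math
import random
from collections import deque

COMM_RANGE = 2.0  # assumed communication range (arbitrary units)


def build_adjacency(positions, comm_range=COMM_RANGE):
    """Return a symmetric 0/1 adjacency matrix.

    UAVs i and j are linked when their Euclidean distance is at most
    comm_range. Self-loops are not added here.
    """
    n = len(positions)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(positions[i], positions[j]) <= comm_range:
                adj[i][j] = adj[j][i] = 1
    return adj


class ReplayBuffer:
    """Fixed-size store of transitions; the oldest entries are dropped first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        # A transition here is any tuple, e.g.
        # (observations, actions, rewards, next_observations, adjacency).
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Uniform random minibatch, capped at the number of stored items.
        k = min(batch_size, len(self.buffer))
        return random.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)


# Three hypothetical UAVs: the first two are within range of each other,
# the third is isolated.
positions = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
adj = build_adjacency(positions)
print(adj)  # [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
```

Because the UAVs keep moving, the adjacency matrix would be rebuilt at every time step, which is why the sketch stores it inside each transition.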

The framework of DGN from the study by Dai, A., Li, R., Zhao, Z., & Zhang, H. (2020, October)

Training with Convolutional Layers and Q Networks:

A convolutional layer integrates features from neighboring nodes using multi-head dot-product attention. Attention weights are first computed to capture how strongly the features of neighboring UAVs are correlated; the outputs of the attention heads are then concatenated and passed through a non-linear function. The resulting features are fed into the Q network, which is trained with deep Q-learning, so that current actions are guided by estimates of future value. Importantly, the loss function includes a term for Q-value accuracy and a term for the consistency of the attention weight distribution, which supports robust learning. With this approach, the method provides dynamic and scalable network control, facilitating UAV collaboration while maximizing coverage and fairness and minimizing energy consumption.
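To make the attention step more concrete, here is a minimal NumPy sketch of one convolutional layer with multi-head dot-product attention, followed by a standard one-step Q-learning target. It is a sketch under stated assumptions rather than the paper's implementation: the scaling by the square root of the head dimension, the use of ReLU as the non-linear function, the discount factor `gamma`, and all function and variable names are assumptions for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - np.max(x, axis=axis, keepdims=True)  # for numerical stability
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def attention_conv(features, adjacency, w_q, w_k, w_v):
    """One graph convolution step with multi-head dot-product attention.

    features:      (n_agents, d_in) encoded observations
    adjacency:     (n_agents, n_agents) 0/1 matrix of communication links
    w_q, w_k, w_v: (n_heads, d_in, d_head) per-head projection weights
    Returns the concatenated head outputs after ReLU, shape
    (n_agents, n_heads * d_head), and the attention weights, shape
    (n_heads, n_agents, n_agents).
    """
    n = features.shape[0]
    mask = (adjacency + np.eye(n)) > 0  # each UAV also attends to itself
    q = np.einsum("nd,hde->hne", features, w_q)
    k = np.einsum("nd,hde->hne", features, w_k)
    v = np.einsum("nd,hde->hne", features, w_v)
    scores = np.einsum("hne,hme->hnm", q, k) / np.sqrt(q.shape[-1])
    scores = np.where(mask[None, :, :], scores, -1e9)  # hide non-neighbors
    weights = softmax(scores, axis=-1)
    heads = np.einsum("hnm,hme->hne", weights, v)
    out = np.concatenate(list(heads), axis=-1)  # concatenate the heads
    return np.maximum(out, 0.0), weights


def td_target(reward, next_q_values, gamma=0.99):
    """One-step Q-learning target: r + gamma * max over next actions."""
    return reward + gamma * np.max(next_q_values, axis=-1)


rng = np.random.default_rng(0)
n_agents, d_in, n_heads, d_head = 3, 4, 2, 5
features = rng.normal(size=(n_agents, d_in))
adjacency = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]])
w_q, w_k, w_v = (rng.normal(size=(n_heads, d_in, d_head)) for _ in range(3))

out, weights = attention_conv(features, adjacency, w_q, w_k, w_v)
print(out.shape)      # (3, 10)
print(weights[0, 2])  # the isolated UAV attends only to itself: [0. 0. 1.]
```

In a full training loop, the attention weights returned here are the quantities that a consistency term in the loss would compare, while `td_target` supplies the regression target for the Q-value term.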


  • When used as detachable base stations, UAVs offer efficient signal coverage, particularly in challenging conditions and for brief communication requirements.
  • In order to maintain steady and valid signals, UAVs must collaborate effectively over dynamic local networks due to limited communication resources and battery power.
  • The graph convolutional multi-agent reinforcement learning (MARL) technique, or DGN, improves signal coverage and fairness while reducing power consumption.
  • DGN’s decentralized methodology offers strong scalability, enabling UAVs to function well with just local observations.
  • In the future, continuous action control techniques will be investigated. For more accurate UAV movement policies, graph convolution and actor-critic techniques like DDPG may be combined.


Sakthivel R

I am a first-year M.Sc. AIML student at SASTRA University.

I am proficient in Python and have experience with data, machine learning, and development tools, including SQL, Excel, Pandas, Scikit, TensorFlow, Git, and Power BI.
