Enhanced Fixed-wing Leader-followers UAV Formation Control Integrating Pre-tuned TD3 Reinforcement Learning and Consensus-based Control Methods

Authors

  • Huda Naji Al-Sudany
    Affiliation
    Department of Control Engineering and Information Technology, Faculty of Electrical Engineering and Informatics, Budapest University of Technology and Economics, Műegyetem rkp. 3., H-1111 Budapest, Hungary
  • Béla Lantos
    Affiliation
    Department of Control Engineering and Information Technology, Faculty of Electrical Engineering and Informatics, Budapest University of Technology and Economics, Műegyetem rkp. 3., H-1111 Budapest, Hungary
https://doi.org/10.3311/PPee.40919

Abstract

This paper addresses a critical challenge in the attitude control of fixed-wing Unmanned Aerial Vehicles (UAVs): the leader and multi-follower formation problem. It introduces a Twin Delayed Deep Deterministic Policy Gradient (TD3) approach based on cascade-forward artificial neural networks (ANNs) to control the leader and multiple followers in autonomous navigation and path tracking. The TD3 component is first trained off-line and then applied in real time. The control system incorporates a novel adaptive Laplacian consensus protocol, built on an undirected communication graph model, that adjusts the inter-UAV connection weights in real time based on relative positions. The approach was implemented on a leader-follower formation consisting of one leader and three follower UAVs. The paper includes a stability analysis of the proposed method, demonstrating the system's overall stability. The effectiveness of the approach is validated through a MATLAB simulation, which shows that TD3 with cascade-forward networks for the actor and the twin critic networks achieves superior performance, with low tracking error, good formation keeping during aggressive maneuvers, and reduced control surface oscillations. The formation adhered more closely to the prescribed distance during sharp turns, with fewer formation deviations at the trajectory end point. The results verify that combining the two strategies preserves formation integrity. The cascade-forward TD3 implementation also offers significant improvements in stability, accuracy, and control efficiency for practical UAV formation control applications.
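The adaptive Laplacian consensus idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' protocol: the weight law (unit weight plus a saturating term in the spacing error), the single-integrator point-mass dynamics, the pinned leader, and the names adaptive_laplacian, consensus_step, and simulate are all assumptions made for illustration. The sketch only shows how edge weights of an undirected graph can be recomputed from relative positions at every step for one leader and three followers.

```python
import numpy as np

def adaptive_laplacian(positions, offsets, adjacency, gain=1.0):
    """Weighted Laplacian of an undirected graph with position-dependent weights.

    Assumed weight law (hypothetical, not taken from the paper):
        w_ij = a_ij * (1 + gain * tanh(| ||p_i - p_j|| - d_ij |))
    where d_ij is the desired spacing implied by the formation offsets.
    """
    dist = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    desired = np.linalg.norm(offsets[:, None, :] - offsets[None, :, :], axis=-1)
    weights = adjacency * (1.0 + gain * np.tanh(np.abs(dist - desired)))
    # L = D - W; symmetric because adjacency, dist and desired are symmetric.
    return np.diag(weights.sum(axis=1)) - weights

def consensus_step(positions, offsets, adjacency, dt=0.05, leader=0):
    """One Euler step of u = -L(p) (p - offsets) with the leader held fixed."""
    lap = adaptive_laplacian(positions, offsets, adjacency)
    update = -dt * (lap @ (positions - offsets))
    update[leader] = 0.0  # assumption: the leader ignores the consensus term
    return positions + update

def simulate(positions, offsets, adjacency, steps=500):
    """Iterate the consensus step and return the final positions."""
    p = np.array(positions, dtype=float)
    for _ in range(steps):
        p = consensus_step(p, offsets, adjacency)
    return p

# Illustrative setup: agent 0 is the leader, agents 1-3 are followers.
adjacency = np.array([[0, 1, 1, 0],
                      [1, 0, 0, 1],
                      [1, 0, 0, 1],
                      [0, 1, 1, 0]], dtype=float)
offsets = np.array([[0.0, 0.0], [-10.0, 10.0], [-10.0, -10.0], [-20.0, 0.0]])
start = np.array([[0.0, 0.0], [5.0, -7.0], [-6.0, 3.0], [4.0, 4.0]])
final = simulate(start, offsets, adjacency)
```

In this sketch the weights stay between 1 and 1 + gain, so the Euler step remains stable for the chosen dt, and the followers settle at their offsets relative to the pinned leader. A real fixed-wing formation would replace the point-mass update with the vehicle dynamics and the TD3 attitude controller described in the paper.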

Keywords:

unmanned aerial vehicles (UAVs), attitude control, artificial neural network (ANN), reinforcement learning (RL), twin delayed deep deterministic policy gradient (TD3), Laplacian consensus algorithm

Published Online

2025-07-16

How to Cite

Al-Sudany, H. N., Lantos, B. “Enhanced Fixed-wing Leader-followers UAV Formation Control Integrating Pre-tuned TD3 Reinforcement Learning and Consensus-based Control Methods”, Periodica Polytechnica Electrical Engineering and Computer Science, 2025. https://doi.org/10.3311/PPee.40919

Section

Articles