Enhanced Fixed-wing Leader-followers UAV Formation Control Integrating Pre-tuned TD3 Reinforcement Learning and Consensus-based Control Methods
Abstract
This paper addresses a critical challenge in the attitude control of fixed-wing Unmanned Aerial Vehicles (UAVs): the leader-follower formation problem with multiple followers. It introduces a Twin Delayed Deep Deterministic Policy Gradient (TD3) approach, built on cascade-forward artificial neural networks (ANNs), to control the leader and multiple followers during autonomous navigation and path tracking. The TD3 component is first trained offline and then applied in real time. The control system also incorporates a novel adaptive Laplacian consensus protocol, defined over an undirected communication graph, that adjusts the inter-UAV connection weights in real time based on relative positions. The approach was implemented on a formation of one leader and three follower UAVs. The paper includes a stability analysis of the proposed method that demonstrates the overall stability of the system. The effectiveness of the approach is validated through a MATLAB simulation, in which the TD3 agent, whose actor and twin critic networks are cascade-forward ANNs, achieves low tracking error, good formation keeping during aggressive maneuvers, and reduced control surface oscillations. The formation adhered more closely to the prescribed inter-UAV distance during sharp turns and showed fewer formation deviations at the trajectory end point. The results verify that combining the two strategies preserves formation integrity. The cascade-forward TD3 implementation also offers significant improvements in stability, accuracy, and control efficiency for practical UAV formation control applications.
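To make the idea of a position-dependent Laplacian consensus protocol concrete, the following is a minimal sketch and not the protocol proposed in this paper. It assumes planar single-integrator kinematics, a stationary leader, and a hypothetical Gaussian weighting law a_ij = exp(-(d_ij / sigma)^2). The function names, the gain, the step size, and sigma are all illustrative assumptions. In the paper's scheme the leader is driven by its own controller, whereas here it is simply held fixed.

```python
import math


def adaptive_weights(positions, edges, sigma=50.0):
    """Symmetric edge weights for an undirected graph.

    Assumption: the weight decays with inter-UAV distance through a
    Gaussian law. This law is hypothetical, chosen only for illustration.
    """
    n = len(positions)
    weights = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        d = math.dist(positions[i], positions[j])
        w = math.exp(-((d / sigma) ** 2))
        weights[i][j] = weights[j][i] = w  # undirected: a_ij == a_ji
    return weights


def laplacian(weights):
    """Graph Laplacian L = D - A built from the weighted adjacency matrix."""
    n = len(weights)
    return [
        [sum(weights[i]) if i == j else -weights[i][j] for j in range(n)]
        for i in range(n)
    ]


def consensus_step(positions, offsets, edges, leader=0, gain=1.0, dt=0.05):
    """One Euler step of u_i = -gain * sum_j a_ij * (e_i - e_j).

    e_i = p_i - offset_i is the formation error of UAV i. The weights are
    recomputed from the current positions at every step.
    """
    lap = laplacian(adaptive_weights(positions, edges))
    err = [(p[0] - o[0], p[1] - o[1]) for p, o in zip(positions, offsets)]
    updated = []
    for i, (x, y) in enumerate(positions):
        if i == leader:
            updated.append((x, y))  # leader is held fixed in this sketch
            continue
        ux = -gain * sum(lap[i][j] * err[j][0] for j in range(len(err)))
        uy = -gain * sum(lap[i][j] * err[j][1] for j in range(len(err)))
        updated.append((x + dt * ux, y + dt * uy))
    return updated


def simulate(positions, offsets, edges, steps=4000):
    """Apply the consensus step repeatedly and return the final positions."""
    for _ in range(steps):
        positions = consensus_step(positions, offsets, edges)
    return positions
```

Calling `simulate` with one leader, three followers, and a connected edge list drives each follower toward the leader position plus its formation offset. Recomputing the Laplacian inside `consensus_step` is what makes the weights adaptive in this sketch.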