Abstract: To address the station-keeping control problem of a stratospheric aerostat in a dynamic wind field, station-keeping controllers based on the deep reinforcement learning D3QN algorithm were designed for the different control channels of an aerostat that operates using the ambient wind, and the impact of different reward functions on the performance of the station-keeping controllers was studied. Station-keeping control simulations were carried out under the task constraints of a three-day station-keeping duration and a 50 km station-keeping radius. The results show that, compared with a station-keeping controller designed with the DDQN method, the controller designed with the D3QN method performs significantly better. When the aerostat's trajectory is controlled by altitude adjustment alone, the average station-keeping radius reaches 25.26 km and the station-keeping time ratio is 96%. With the aid of horizontal propulsion, the average station-keeping radius is significantly reduced and the station-keeping time ratio is significantly increased. The simulations also verify the strong robustness of the station-keeping controller based on deep reinforcement learning, and show that the controller can be designed with different reward functions to meet the requirements of different station-keeping tasks.
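The two performance figures quoted above can be illustrated with a minimal sketch. The abstract does not define them, so the definitions here are assumptions: the average station-keeping radius is taken as the mean horizontal distance from the station centre, and the station-keeping time ratio as the fraction of uniformly sampled time steps spent within the 50 km radius. The function name `station_keeping_metrics` and the sample track are hypothetical and are not data from the study.

```python
import math


def station_keeping_metrics(positions_km, radius_km=50.0):
    """Return (mean distance from centre, fraction of samples inside radius).

    positions_km: (x, y) offsets from the station centre in km,
    assumed to be sampled at uniform time intervals.
    """
    # Horizontal distance of each sample from the station centre.
    distances = [math.hypot(x, y) for x, y in positions_km]
    # Assumed definition: average station-keeping radius.
    mean_radius = sum(distances) / len(distances)
    # Assumed definition: share of samples within the allowed radius.
    time_ratio = sum(d <= radius_km for d in distances) / len(distances)
    return mean_radius, time_ratio


# Hypothetical four-sample track, for illustration only.
track = [(3.0, 4.0), (30.0, 40.0), (60.0, 80.0), (0.0, 10.0)]
mean_radius, time_ratio = station_keeping_metrics(track)
print(f"average radius: {mean_radius:.2f} km, time ratio: {time_ratio:.0%}")
```

Under these assumed definitions, a lower average radius and a higher time ratio both indicate tighter station keeping, which is the direction of improvement reported for the case with horizontal propulsion.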