Abstract: To address missile trajectory planning, a suitable Gym training environment is established. Based on the TD3 deep reinforcement learning framework, an agent network structure and its reward functions are designed according to the terminal and process constraints, forming an intelligent trajectory planning method. The algorithm is deployed on an embedded GPU computing acceleration platform, where perturbation (bias) simulations and comparison tests are conducted. The results show that the method satisfies the missile capability requirements and process constraints across tasks with different ranges, effectively overcomes environmental disturbances, and adapts to different object models. The method is also far faster than the widely used GPOPS-II toolbox: computing a single-step trajectory command takes less than 1 ms, which supports real-time online trajectory generation and provides an effective implementation path and technical support for engineering applications.