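Application of deep reinforcement learning to missile trajectory planning
Affiliation:

1. Unmanned Systems Technology Research Center, National Innovation Institute of Defense Technology; 2. Control Theory and Guidance Technology Research Center, Harbin Institute of Technology; 3. Institute of Space Environment and Material Science, Harbin Institute of Technology; 4. College of Aerospace Science and Engineering, National University of Defense Technology

CLC number:

TP18; TP27; V24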

    Abstract:

    To address the missile trajectory planning problem, a purpose-built Gym training environment is established. The agent network structure is designed within the TD3 deep reinforcement learning framework, and the reward function is designed according to terminal and process constraints, yielding an intelligent trajectory planning method. The method is deployed on an embedded GPU acceleration platform, and dispersion (perturbed-parameter) simulations and comparison tests are carried out. The results show that, across different range requirements, the method satisfies the missile's capability limits and process constraints, effectively overcomes environmental disturbances, and adapts to different vehicle models. The method is also computationally fast, far faster than the widely used GPOPS-II toolbox: computing a single-step trajectory command takes less than one millisecond, which supports real-time online trajectory generation and offers a practical route to engineering application.
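    The abstract describes a Gym-style training environment whose reward combines process and terminal constraints. The sketch below is only a toy illustration of that idea, not the authors' environment: the class name `ToyTrajectoryEnv`, the 2-D constant-speed point-mass dynamics, every parameter value, and the penalty weights are assumptions made for this example. It follows the classic four-value `step` convention (observation, reward, done, info) in plain Python so that it runs without any RL library installed.

```python
import math
from dataclasses import dataclass


@dataclass
class ToyTrajectoryEnv:
    """Hypothetical 2-D point-mass environment with a Gym-style reset/step API."""
    target_range: float = 1000.0   # desired downrange distance in m (assumed)
    speed: float = 100.0           # constant speed in m/s (assumed)
    dt: float = 0.1                # integration step in s (assumed)
    max_rate: float = 0.2          # limit on the commanded angle rate in rad/s (assumed)
    max_steps: int = 500           # episode length cap (assumed)

    def reset(self):
        # Start at the origin, climbing at 30 degrees.
        self.x, self.y = 0.0, 0.0
        self.gamma = math.radians(30.0)
        self.steps = 0
        return self._obs()

    def _obs(self):
        # Normalized position plus flight-path angle.
        return (self.x / self.target_range, self.y / self.target_range, self.gamma)

    def step(self, action):
        # Process constraint: penalize the part of the command that exceeds the limit.
        violation = max(0.0, abs(action) - self.max_rate)
        rate = max(-self.max_rate, min(self.max_rate, action))

        # Simple kinematics with explicit Euler integration.
        self.gamma += rate * self.dt
        self.x += self.speed * math.cos(self.gamma) * self.dt
        self.y += self.speed * math.sin(self.gamma) * self.dt
        self.steps += 1

        landed = self.y <= 0.0 and self.steps > 1
        done = landed or self.steps >= self.max_steps

        reward = -violation
        if done:
            # Terminal constraint: penalize the normalized range error.
            reward -= abs(self.x - self.target_range) / self.target_range
        return self._obs(), reward, done, {}


if __name__ == "__main__":
    env = ToyTrajectoryEnv()
    obs, done, total = env.reset(), False, 0.0
    while not done:
        # A fixed command stands in for a trained policy here.
        obs, reward, done, _ = env.step(-0.1)
        total += reward
    print("episode return:", total)
```

    In a setup like the one the abstract outlines, an off-policy agent such as TD3 would replace the fixed command in the loop, mapping each observation to a continuous action and learning from the returned rewards.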

History
  • Received: 2023-05-08
  • Last revised: 2025-05-07
  • Accepted: 2023-10-19
  • Published online: 2025-04-03