Reinforcement learning of ballistic maneuver adjustment strategy after missile penetration

doi:10.11887/j.cn.202402010

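Authors: FAN Boxuan, CHEN Guiming, HAN Lei, et al.
Affiliations: (1. College of Operational Support, Rocket Force University of Engineering, Xi'an 710025, Shaanxi, China; 2. The First Military Representative Office of the Rocket Force Equipment Department in Xi'an Region, Xi'an 710025, Shaanxi, China)
About the author: FAN Boxuan (born 1987), female, from Fuyu, Jilin Province; engineer, Ph.D. E-mail: 1092442646@qq.com
CLC number: TJ765.3
Funding: National Natural Science Foundation of China (71601180)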

Abstract:

To address the trajectory maneuver adjustment problem that arises when a ballistic missile's flight trajectory deviates substantially from the nominal trajectory after midcourse penetration, an optimization model for the maneuver adjustment timing strategy was established. A reverse-order Q-learning algorithm for maneuver adjustment was designed, in which a tile coding approximator encodes the state feature space and the action values are approximated linearly over that encoding. A reverse-order update mechanism combining the Q-learning algorithm with the Monte Carlo method was constructed to train the optimal timing of missile maneuver adjustment. Simulation results show that, under the given scenario parameters, the strategy obtained after 10 000 generations of reinforcement learning training reliably controls the post-penetration trajectory adjustment decisions with the minimum number of maneuvers, which verifies the effectiveness of the method.
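The abstract names two ingredients, a tile coding approximator with a linear value estimate and a reverse-order update over a completed episode. The sketch below illustrates both on a toy problem and is not the authors' implementation. Everything in it is an assumption made for illustration: the one-dimensional state in [0, 1), the two actions (0 = wait, 1 = maneuver), the hypothetical names `TileCoder`, `q_values`, and `reverse_sweep`, the hand-made episode, and the reading of "reverse order" as applying the one-step Q-learning target to a Monte Carlo style full-episode rollout from the last transition back to the first.

```python
import numpy as np


class TileCoder:
    """Tile coding for a 1-D state in [low, high) using several offset tilings."""

    def __init__(self, low, high, n_tilings=4, tiles_per_tiling=8):
        self.low, self.high = low, high
        self.n_tilings = n_tilings
        self.tiles = tiles_per_tiling
        self.width = (high - low) / tiles_per_tiling
        # One extra tile per tiling absorbs the shift introduced by the offset.
        self.n_features = n_tilings * (tiles_per_tiling + 1)

    def active(self, x):
        """Return one active feature index per tiling for state x."""
        x = min(max(x, self.low), self.high - 1e-9)  # clamp into range
        indices = []
        for t in range(self.n_tilings):
            offset = t * self.width / self.n_tilings
            tile = int((x - self.low + offset) / self.width)
            indices.append(t * (self.tiles + 1) + tile)
        return indices


def q_values(w, coder, x):
    """Linear estimate: Q(x, a) is the sum of the weights of the active tiles."""
    return w[:, coder.active(x)].sum(axis=1)


def reverse_sweep(w, coder, episode, alpha=0.5, gamma=1.0):
    """Apply one-step Q-learning updates to a finished episode, last step first.

    episode is a list of (state, action, reward, next_state, done) tuples.
    Sweeping backwards lets a terminal reward reach early states in one pass.
    """
    step = alpha / coder.n_tilings  # split the step size across tilings
    for x, a, r, x_next, done in reversed(episode):
        bootstrap = 0.0 if done else gamma * q_values(w, coder, x_next).max()
        feats = coder.active(x)
        td_error = r + bootstrap - w[a, feats].sum()
        w[a, feats] += step * td_error
    return w


# Hand-made three-step episode (illustrative only): wait, wait, then maneuver
# and receive a terminal reward of 1.
coder = TileCoder(0.0, 1.0)
episode = [
    (0.1, 0, 0.0, 0.5, False),
    (0.5, 0, 0.0, 0.9, False),
    (0.9, 1, 1.0, 0.9, True),
]
w = reverse_sweep(np.zeros((2, coder.n_features)), coder, episode)
print(q_values(w, coder, 0.1))
```

After a single backward sweep the value of waiting at the first state is already nonzero, because each update bootstraps from a successor that was updated just before it. Sweeping the same episode in forward order would leave that value at zero until later passes.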

Cite this article

FAN Boxuan, CHEN Guiming, HAN Lei, et al. Reinforcement learning of ballistic maneuver adjustment strategy after missile penetration[J]. Journal of National University of Defense Technology, 2024, 46(2): 94-103.

History

Received: 2022-01-15
Published online: 2024-04-07
Publication date: 2024-04-28
