Reinforcement learning of ballistic maneuver adjustment strategy after missile penetration
CSTR:
Author:
Affiliation:

(1. College of Operational Support, Rocket Force University of Engineering, Xi′an 710025, China; 2. The First Military Representative Office of the Rocket Force Equipment Department in Xi′an Region, Xi′an 710025, China)

CLC Number:

TJ765.3

Fund Project:

    Abstract:

    To address the trajectory maneuver adjustment problem caused by large flight-trajectory deviations after midcourse penetration of a ballistic missile, an optimization model of the maneuver adjustment timing strategy was established. A reverse-order Q-learning algorithm for maneuver adjustment was designed, in which tile coding was used to encode the state feature space and a linear approximation was applied over the encoded space. A reverse-order update mechanism combining the Q-learning algorithm with the Monte Carlo method was constructed, and the optimal timing of missile maneuver adjustment was obtained by training. Simulation results show that, under the given scenario parameters, the strategy obtained after 10 000 generations of reinforcement learning training can reliably control the trajectory adjustment decision after missile penetration with the minimum number of maneuvers, which verifies the effectiveness of the method.
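
The abstract names three ingredients: tile coding of the state features, a linear approximation of the action values, and an update that sweeps an episode in reverse order while combining Q-learning with Monte Carlo returns. The following is a minimal Python sketch of how such pieces could fit together, not the authors' implementation. Everything specific in it is an assumption made for illustration: the one-dimensional normalized state, the two actions (0 = hold, 1 = maneuver), the reward values, the helper names (`TileCoder`, `q_value`, `reverse_order_update`), and in particular the `mix` weight that blends the Monte Carlo return with the one-step Q-learning target.

```python
import numpy as np


class TileCoder:
    """Uniform-grid tile coding over a box-shaped state space (illustrative only)."""

    def __init__(self, lows, highs, n_tilings=8, n_tiles=8):
        self.lows = np.asarray(lows, dtype=float)
        self.highs = np.asarray(highs, dtype=float)
        self.n_tilings = n_tilings
        self.n_tiles = n_tiles
        self.dims = len(self.lows)
        # One extra cell per dimension so that shifted grids still cover the box.
        self.cells_per_tiling = (n_tiles + 1) ** self.dims
        self.n_features = n_tilings * self.cells_per_tiling

    def active_features(self, state):
        """Return one active feature index per tiling for the given state."""
        span = self.highs - self.lows
        scaled = (np.asarray(state, dtype=float) - self.lows) / span * self.n_tiles
        indices = []
        for t in range(self.n_tilings):
            offset = t / self.n_tilings  # each tiling is shifted by a fraction of a tile
            coords = np.clip(np.floor(scaled + offset).astype(int), 0, self.n_tiles)
            flat = 0
            for c in coords:
                flat = flat * (self.n_tiles + 1) + int(c)
            indices.append(t * self.cells_per_tiling + flat)
        return indices


def q_value(w, coder, state, action):
    """Linear action value: sum of the weights of the active tiles."""
    return w[action, coder.active_features(state)].sum()


def reverse_order_update(w, coder, episode, alpha=0.1, gamma=1.0, mix=0.5):
    """Sweep a finished episode from the last transition to the first.

    ASSUMPTION: the target blends the Monte Carlo return G with the one-step
    Q-learning target using a fixed weight `mix`; the source does not state
    how the two methods are combined.
    """
    G = 0.0
    step = alpha / coder.n_tilings
    n_actions = w.shape[0]
    for state, action, reward, next_state, done in reversed(episode):
        G = reward + gamma * G  # Monte Carlo return accumulated backwards
        if done:
            bootstrap = reward
        else:
            best_next = max(q_value(w, coder, next_state, a) for a in range(n_actions))
            bootstrap = reward + gamma * best_next  # Q-learning target
        target = mix * G + (1.0 - mix) * bootstrap
        feats = coder.active_features(state)
        w[action, feats] += step * (target - w[action, feats].sum())
    return w


# Hypothetical toy episode: (state, action, reward, next_state, done).
coder = TileCoder(lows=[0.0], highs=[1.0])
w = np.zeros((2, coder.n_features))
episode = [
    ([0.1], 1, -1.0, [0.5], False),
    ([0.5], 0, -1.0, [0.9], False),
    ([0.9], 1, 10.0, [0.9], True),
]
reverse_order_update(w, coder, episode)
```

Sweeping the episode backwards means the terminal reward has already updated the last state's value by the time earlier transitions bootstrap from it, so one pass propagates information along the whole trajectory.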

History
  • Received: January 15, 2022
  • Revised:
  • Adopted:
  • Online: April 07, 2024
  • Published: April 28, 2024