Application of reinforcement learning in multi-period weapon portfolio planning problems
Affiliation:

(1. The Sixty-third Research Institute, National University of Defense Technology, Nanjing 210007, China; 2. School of Economics, Zhejiang University of Finance & Economics, Hangzhou 310018, China; 3. College of Systems Engineering, National University of Defense Technology, Changsha 410073, China; 4. Southwest Electronics and Telecommunication Technology Research Institute, Chengdu 610041, China)

CLC Number:

O22; N94

    Abstract:

    To address the difficulty of selecting and planning weapon systems in multi-period development problems, an optimization and simulation approach combining a multi-objective optimization algorithm with reinforcement learning was proposed. A multi-objective optimization model was built to maximize the capability and minimize the cost of the weapon portfolio in each period. A solution algorithm based on the non-dominated sorting genetic algorithm-Ⅲ (NSGA-Ⅲ) was presented to obtain the Pareto set for each period, and an optimization model for the multi-period problem was then built on these Pareto sets. Q-Learning, a reinforcement learning algorithm, was used to search within the Pareto set in two different ways to select the weapon portfolio for each period; the outcome of each selection informs the selection in the next period and the optimization of the portfolios over the entire planning horizon. An illustrative example demonstrated the effectiveness of the proposed model and hybrid algorithm, which can support decision making in weapon development and planning.
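
The second stage described in the abstract, searching the per-period Pareto sets with Q-Learning, can be illustrated with a minimal tabular sketch. This is not the authors' implementation. It assumes the Pareto sets have already been computed (for example by NSGA-Ⅲ) and are given as lists of (capability, cost) pairs. The function name `q_learning_select`, the state encoding (period, previous choice), the scalar reward `capability - cost_weight * cost`, and all hyperparameters are hypothetical choices made only for illustration; the abstract does not specify them, nor the two search strategies it mentions.

```python
import random


def q_learning_select(pareto_sets, cost_weight=0.5, episodes=2000,
                      alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Pick one (capability, cost) point per period with tabular Q-learning.

    State  (assumed): (period index, index chosen in the previous period),
                      with -1 as the previous index before the first period.
    Action (assumed): index of a point in the current period's Pareto set.
    Reward (assumed): capability - cost_weight * cost.
    """
    rng = random.Random(seed)
    q = {}  # maps (state, action) -> estimated value, default 0.0

    def best_action(state, n_actions):
        # Greedy action; ties resolve to the lowest index.
        return max(range(n_actions), key=lambda a: q.get((state, a), 0.0))

    n_periods = len(pareto_sets)
    for _ in range(episodes):
        prev = -1
        for t in range(n_periods):
            state = (t, prev)
            n_actions = len(pareto_sets[t])
            # Epsilon-greedy exploration over the current Pareto set.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                action = best_action(state, n_actions)
            capability, cost = pareto_sets[t][action]
            reward = capability - cost_weight * cost
            # Bootstrap from the best action in the next period, if any.
            if t + 1 < n_periods:
                nxt = (t + 1, action)
                future = max(q.get((nxt, a), 0.0)
                             for a in range(len(pareto_sets[t + 1])))
            else:
                future = 0.0
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * future - old)
            prev = action

    # Greedy rollout of the learned policy: one index per period.
    plan, prev = [], -1
    for t in range(n_periods):
        action = best_action((t, prev), len(pareto_sets[t]))
        plan.append(action)
        prev = action
    return plan
```

Calling `q_learning_select(pareto_sets)` with one list of (capability, cost) pairs per period returns one index per period, identifying the selected portfolio in each Pareto set. Because the state carries the previous period's choice, a reward that depends on earlier selections (budget carry-over, for instance) could be substituted without changing the structure of the loop.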

History
  • Received: January 18, 2020
  • Online: September 29, 2021
  • Published: October 28, 2021