Research on the method of target selecting policy based on the Markov decision process

DOI: 10.11887/j.cn.201402027


Funding: Supported by the National Natural Science Foundation of China (61273322, 71001105, 91024006)


Abstract:

Target selection is a key element of military operational planning. A Markov decision process (MDP) method is used to solve the multi-phase target selection problem in which the targets have complex relations with one another. First, an and-or tree is used to describe how states in the different layers of the target system of systems (TSoS) influence one another, and, with the overall failure of the TSoS as the objective, a multi-phase strike target selection model based on a discrete-time MDP (DTMDP) is established. Second, a heuristic built on the LRTDP algorithm is proposed: it estimates the value of the current TSoS state from the potential resource consumption and failure probability of the evolution from that state to the TSoS failure state. The heuristic effectively excludes intermediate states in the search space from which the failure state cannot be reached, which compresses the huge state space produced by the complex relations among targets. Finally, an experimental case validates the method. The results show that the method is intuitive and practical, and that it offers a useful reference for strike decisions on targets with complex interrelations.
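The ideas in the abstract can be illustrated with a small sketch. It is not the authors' model or their LRTDP heuristic: the tree, the target names, the strike costs and success probabilities, and every function name below are hypothetical, and the setting is deliberately simplified (one strike per step, unlimited retries, independent outcomes, every leaf appearing once in the tree). It shows an and-or tree failure test, a tree-based lower bound that can act as a state estimate, and a dead-end check that discards states whose remaining resources cannot reach system failure.

```python
from functools import lru_cache

# Hypothetical target system: an AND node fails only when all children fail,
# an OR node fails when any child fails, a leaf fails once it is destroyed.
TREE = ("OR", [("AND", ["radar", "command"]), "power"])
# Assumed per-strike (resource cost, success probability) for each leaf target.
STRIKE = {"radar": (1.0, 0.8), "command": (2.0, 0.5), "power": (4.0, 0.5)}


def failed(node, destroyed):
    """Return True if the (sub)system rooted at `node` has failed."""
    if isinstance(node, str):
        return node in destroyed
    kind, children = node
    results = [failed(child, destroyed) for child in children]
    return all(results) if kind == "AND" else any(results)


def bound(node, destroyed, leaf_cost):
    """Lower bound on the cost of making `node` fail, for a given leaf cost."""
    if isinstance(node, str):
        return 0.0 if node in destroyed else leaf_cost(node)
    kind, children = node
    values = [bound(child, destroyed, leaf_cost) for child in children]
    return sum(values) if kind == "AND" else min(values)


def heuristic(destroyed):
    """State estimate: a leaf costs cost / p on average until it is destroyed."""
    return bound(TREE, destroyed, lambda t: STRIKE[t][0] / STRIKE[t][1])


def reachable(destroyed, budget):
    """Dead-end test: is the budget enough even if every strike succeeds?"""
    return bound(TREE, destroyed, lambda t: STRIKE[t][0]) <= budget


@lru_cache(maxsize=None)
def optimal_cost(destroyed=frozenset()):
    """Exact expected cost to system failure in this toy MDP."""
    if failed(TREE, destroyed):
        return 0.0
    # A failed strike leaves the state unchanged, so V = c + p*V' + (1-p)*V,
    # which rearranges to V = c/p + V'.
    return min(
        cost / p + optimal_cost(destroyed | {target})
        for target, (cost, p) in STRIKE.items()
        if target not in destroyed
    )
```

In this toy setting the tree-based estimate coincides with the exact expected cost, because leaves are not shared and retries are unlimited. With shared targets, phase limits, or resource constraints, as in the model the abstract describes, such an estimate would only be a guide for the search, and the dead-end test would carry more of the pruning.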


Cite this article

LEI Ting, ZHU Cheng, ZHANG Weiming. Research on the method of target selecting policy based on the Markov decision process[J]. Journal of National University of Defense Technology, 2014, 36(2): 161-167.


History

Received: 2013-07-16
Published online: 2014-05-14
