Abstract: To address the dynamic force deployment problem, a multi-agent reinforcement learning strategy planning method based on SVD (Shapley value decomposition) was proposed. First, SVD was used to explain the reward distribution among cooperating agents, and an SVD-based reinforcement learning method was used to analyse this distribution and solve for the strategy of a Markov convex game. Second, based on a sea-air cross-domain cooperative confrontation scenario, the allocation of space-domain combat resources in heterogeneous multi-entity cooperative confrontation was analysed, a strategy planning model for dynamic force deployment was built, and the state space, action space, and reward function of the problem were designed. Finally, simulation experiments on typical application scenarios were carried out with a wargaming system to verify the approach to the dynamic force deployment problem. The results show that, compared with several classes of baseline algorithms, the proposed method shows excellent performance in strategy planning for dynamic force deployment and is theoretically interpretable. The proposed method learns a strategy of "layered interception, zonal confrontation, core cover, and hierarchical breakthrough".
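To make the idea of Shapley-value reward distribution concrete, the following is a minimal sketch, not the method of this paper: it computes exact Shapley values for a small cooperative game by enumerating coalitions. The agent names, the capability weights, and the value function `toy_value` are all hypothetical and chosen only so that the game is convex.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values by enumerating every coalition.

    players: list of hashable player identifiers.
    value:   function mapping a frozenset of players to a coalition reward.
    """
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that player i joins.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of i to coalition s.
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


# Hypothetical capability weights for three heterogeneous agents.
WEIGHTS = {"ship": 1.0, "fighter": 2.0, "radar": 3.0}


def toy_value(coalition):
    # Assumed reward: square of the summed weights, which is a convex
    # (supermodular) game when all weights are non-negative.
    return sum(WEIGHTS[p] for p in coalition) ** 2


agents = list(WEIGHTS)
phi = shapley_values(agents, toy_value)
print(phi)
```

In this toy game the shares sum to the grand-coalition reward of 36 and come out at roughly 6, 12, and 18 for the ship, fighter, and radar respectively. Exact enumeration scales exponentially with the number of agents, so it serves only to illustrate the decomposition on a tiny example.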