Intelligent cooperative maneuver decision-making approach for vehicles under strong information constraints

QIU Peihuan (丘沛桓); NI Weilin (倪炜霖); WU Zhigang (吴志刚); LIANG Haizhao (梁海朝)

Updated: 2026-05-06

- Original title (Chinese): 强信息约束下的飞行器智能协同机动决策方法
- Journal: Journal of Harbin Institute of Technology (哈尔滨工业大学学报), 2026, Vol. 58, No. 4
- Author affiliation: School of Aeronautics and Astronautics, Sun Yat-sen University, Shenzhen, Guangdong 518107
- Funding: National Natural Science Foundation of China (62388101)
- DOI: 10.11918/202503020
- CLC number: V11
- Print publication: 2026
丘沛桓, 倪炜霖, 吴志刚, 梁海朝. 强信息约束下的飞行器智能协同机动决策方法[J]. 哈尔滨工业大学学报, 2026, 58(4): 11. DOI: 10.11918/202503020

QIU Peihuan, NI Weilin, WU Zhigang, et al. Intelligent cooperative maneuver decision-making approach for vehicles under strong information constraints[J]. Journal of Harbin Institute of Technology, 2026, 58(4): 11. DOI: 10.11918/202503020

Abstract

To escape an interceptor in a "target-interceptor-defender" multi-role game scenario, a hypersonic vehicle needs to execute a cooperative maneuver strategy with a defender vehicle. However, owing to the limitations of its detection devices, the hypersonic vehicle must make cooperative maneuver decisions under strong information constraints, namely imperfect, incomplete, and intermittent information. To address this, an end-to-end cooperative maneuver decision-making approach is proposed by integrating a multi-agent deep reinforcement learning algorithm, enabling the hypersonic vehicle to maneuver cooperatively under strong information constraints and thereby escape successfully. First, the scenario is modeled as a decentralized partially observable Markov decision process, and an observation-sharing and stacking mechanism is proposed for designing the local observation state space under the strong information constraints. Second, to address the sparse-reward problem in multi-agent reinforcement learning, a cooperative decision-making reward function is constructed that combines the game relationships with the zero-effort miss distance, improving the training efficiency of the multi-agent system in complex game scenarios. Finally, a multi-agent cooperative decision-making network architecture is designed, comprising basic agent networks and a top-level value decomposition network. This architecture extracts spatio-temporal trajectory features of the vehicles from imperfect, incomplete, and intermittent information, enabling policy coordination among the agents and cooperative maneuver decision-making for the vehicles. The results show that a hypersonic vehicle equipped with the proposed approach can escape successfully in multi-role game scenarios under strong information constraints, and that the approach exhibits strong performance and robustness in numerical simulations, including typical game scenarios and Monte Carlo tests.
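The zero-effort miss distance named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's reward formula, so everything below is an assumption for illustration only: `zero_effort_miss` uses the standard constant-velocity definition (the closest-approach distance if neither vehicle accelerates from now on), and `shaped_reward`, its weights, and its `scale` parameter are hypothetical names, not the authors' design.

```python
import math
from typing import Sequence, Tuple


def zero_effort_miss(rel_pos: Sequence[float],
                     rel_vel: Sequence[float]) -> Tuple[float, float]:
    """Return (time_to_go, miss_distance) under a constant-velocity assumption.

    rel_pos and rel_vel are the relative position and velocity of one
    vehicle with respect to another, in any consistent units.
    """
    closing = sum(p * v for p, v in zip(rel_pos, rel_vel))
    speed_sq = sum(v * v for v in rel_vel)
    if speed_sq == 0.0 or closing >= 0.0:
        # The vehicles are not closing, so the closest approach is now.
        return 0.0, math.sqrt(sum(p * p for p in rel_pos))
    t_go = -closing / speed_sq
    miss = math.sqrt(sum((p + v * t_go) ** 2
                         for p, v in zip(rel_pos, rel_vel)))
    return t_go, miss


def shaped_reward(zem_interceptor_target: float,
                  zem_defender_interceptor: float,
                  w_escape: float = 1.0,
                  w_defend: float = 1.0,
                  scale: float = 1000.0) -> float:
    """Hypothetical dense team reward (an assumption, not the paper's formula).

    The team benefits when the interceptor would miss the target by a wide
    margin and when the defender would pass close to the interceptor.
    """
    return (w_escape * math.tanh(zem_interceptor_target / scale)
            - w_defend * math.tanh(zem_defender_interceptor / scale))


# Usage: interceptor 1000 m downrange and 100 m off-axis, closing at 500 m/s.
t_go, miss = zero_effort_miss((1000.0, 100.0), (-500.0, 0.0))
print(t_go, miss)  # 2.0 100.0
```

A dense term of this kind gives the agents a learning signal at every step rather than only at the end of an engagement, which is the general motivation for reward shaping under sparse rewards.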
