A review of online reinforcement learning control for systems with unknown models: theory, methods, and challenges
Author:
Affiliation:

State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University

Author biography:

Corresponding author:

CLC number:

TP13; TP181

Fund projects:

Zhejiang Provincial "Pioneer" and "Leading Goose" R&D Program (2024C01163); National Natural Science Foundation of China (62133003); Zhejiang University Special Program of the State Key Laboratory of Industrial Control Technology (ICT2025C01); Open Research Project of the State Key Laboratory of Industrial Control Technology (ICT2025B07)



Abstract:

In fields such as intelligent manufacturing, aerospace, and robotics, control systems often operate with unknown dynamic models, which severely limits the applicability of traditional model-based control methods. Reinforcement learning (RL), as a data-driven control approach, can learn and optimize control policies through interaction with the environment, and therefore shows great promise for optimal control tasks in model-unknown settings. Focusing on the problem of unknown dynamics in continuous-time systems, this survey first reviews the development of general RL algorithms and their applications in scenarios with known models, drawing on industrial examples and theoretical analysis results. It then summarizes representative methods for model-unknown scenarios, including model-based RL, off-policy integral RL, and Q-learning, and introduces Lyapunov-based theoretical analysis tools together with the relevant assumptions. Finally, it discusses frontier directions and the challenges facing existing methods, with emphasis on large decision models under incomplete information, safe RL, and RL with enhanced stability and robustness.
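The model-free learning idea summarized above — improving a policy purely from observed (state, action, reward, next-state) samples, without querying the system model — can be illustrated with a minimal, hedged Q-learning sketch. This is a toy discrete-time chain MDP in Python; all names and parameters here are illustrative choices, not from the surveyed work, which targets continuous-time systems:

```python
import random

def q_learning(n_states=5, n_actions=2, episodes=500,
               alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy chain MDP (illustrative only).

    States 0..n_states-1; action 0 moves left, action 1 moves right.
    Reaching the rightmost state yields reward 1 and ends the episode.
    The agent never uses the transition model in its update: it learns
    only from observed (s, a, r, s') tuples, i.e. the model-unknown setting.
    """
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            # epsilon-greedy exploration over the current Q estimate
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda x: Q[s][x])
            # environment step (hidden from the learning rule itself)
            s2 = max(0, s - 1) if a == 0 else min(goal, s + 1)
            r = 1.0 if s2 == goal else 0.0
            # temporal-difference update toward the Bellman target
            target = r + gamma * max(Q[s2])
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q
```

After training, the greedy policy at every non-goal state prefers moving right, recovering the optimal behavior without ever having access to the transition function. The continuous-time counterparts discussed in the survey (e.g. integral RL) replace this tabular update with integral Bellman equations and function approximation, but share the same data-driven structure.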

History
  • Received: 2025-06-30
  • Revised: 2025-11-04
  • Accepted: 2025-10-11
  • Published online:
  • Published: