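Reinforcement Learning using Exploring Latent Variable based on Meta-Learning

Affiliations:

1. Artificial Intelligence Research Center, National Innovation Institute of Defense Technology, Academy of Military Sciences; 2. Unit 32806 of the Chinese People's Liberation Army; 3. Xi'an Satellite Control Center

CLC number:

TP181

Funding:

Young Scientists Fund of the National Natural Science Foundation of China (62206307)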

    Abstract:

    This paper studies efficient online exploration by an agent within a single reinforcement learning task. Existing exploration methods either make poor use of interaction data or require additional data from other tasks. To address both problems, the paper introduces an exploration latent variable that is learned online, characterizes the current task, and assists the policy network in making decisions. The method needs neither multi-task data from other tasks nor extra environment interaction steps in the current task: the exploration latent variable is updated inside a learnable environment model, and that model is in turn updated in a supervised manner on the agent's interaction data from the real environment. The exploration latent variable therefore "scouts ahead" in a model that simulates the real environment, and the task exploration information it gathers helps the agent explore more effectively and perform better in the real environment. Experiments on typical continuous control tasks in reinforcement learning show a performance improvement of about 30%. The work offers guidance and a point of reference for research on single-task exploration and on meta reinforcement learning.
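The loop described in the abstract (fit an environment model on real transitions, update the latent variable inside that model, then act in the real environment with a latent-conditioned policy) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the 1-D environment, the known reward function, the fixed `tanh` policy, the linear least-squares model, and the finite-difference update of the latent `z` are all hypothetical stand-ins. Unlike the method in the abstract, this sketch also seeds the dataset with a few random-action steps so that the linear fit is well conditioned.

```python
import numpy as np

TARGET = 1.0  # hypothetical goal state of the toy task


def real_step(s, a, rng):
    """True dynamics of the toy environment (unknown to the agent)."""
    return s + 0.5 * a + 0.01 * rng.normal()


def reward(s):
    """Reward is assumed known here; only the dynamics are learned."""
    return -(s - TARGET) ** 2


def policy(s, z, theta=-0.5):
    """Fixed policy conditioned on the state s and the latent z."""
    return np.tanh(theta * s + z)


def fit_model(transitions):
    """Supervised model update: least-squares fit of s' ~ w[0]*s + w[1]*a."""
    X = np.array([[s, a] for s, a, _ in transitions])
    y = np.array([s_next for _, _, s_next in transitions])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


def imagined_return(z, w, horizon=10):
    """Return of a rollout inside the learned model (no real steps used)."""
    s, total = 0.0, 0.0
    for _ in range(horizon):
        s = w[0] * s + w[1] * policy(s, z)
        total += reward(s)
    return total


def update_latent(z, w, lr=0.01, eps=1e-3, steps=50):
    """Finite-difference gradient ascent on z, using only the model."""
    for _ in range(steps):
        grad = (imagined_return(z + eps, w) - imagined_return(z - eps, w)) / (2 * eps)
        z += lr * grad
    return z


def real_rollout(z, rng, horizon=10):
    """Run the latent-conditioned policy in the real environment."""
    s, total, transitions = 0.0, 0.0, []
    for _ in range(horizon):
        a = policy(s, z)
        s_next = real_step(s, a, rng)
        transitions.append((s, a, s_next))
        total += reward(s_next)
        s = s_next
    return total, transitions


def run_demo(seed=0, rounds=3):
    rng = np.random.default_rng(seed)
    # Seed the dataset with random actions (an extra step this sketch needs).
    data, s = [], 0.0
    for _ in range(30):
        a = rng.uniform(-1.0, 1.0)
        s_next = real_step(s, a, rng)
        data.append((s, a, s_next))
        s = s_next
    z = 0.0
    baseline, _ = real_rollout(z, rng)  # return before any latent update
    for _ in range(rounds):
        w = fit_model(data)              # model learns from real data
        z = update_latent(z, w)          # latent "scouts ahead" in the model
        final, new_data = real_rollout(z, rng)
        data.extend(new_data)            # real rollouts feed the next model fit
    return baseline, final, z


baseline_ret, final_ret, z_final = run_demo()
print(f"return with z=0: {baseline_ret:.2f}, with learned z: {final_ret:.2f}")
```

In this sketch the latent is improved without spending real environment steps on its own update: `update_latent` only calls `imagined_return`, which rolls out the fitted model. A neural policy and model, and a gradient-based meta-learning update of the latent, would be the natural replacements for the linear pieces used here.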

History
  • Received: 2023-05-21
  • Last revised: 2023-09-17
  • Accepted: 2023-10-20