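Yong Song (宋勇)

Personal Information

Professor, Doctoral Supervisor, Master's Supervisor

Positions: Associate Editor of Intelligence & Robotics; Executive Council Member of the Shandong Association of Automation; Vice President of the Weihai Society of Mechatronics and Automation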

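Gender: Male

Alma mater: Shandong University

Education: Graduate (doctoral)

Degree: Doctor of Engineering

Employment status: Active

Affiliation: School of Mechanical, Electrical and Information Engineering

Date of employment: 2001-07-01

Discipline: Control Theory and Control Engineering

Office: Room 605B, North Building, Zhixing Building

Email: songyong@sdu.edu.cn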

Publications

Goal-Conditioned Reinforcement Learning With Adaptive Intrinsic Curiosity and Universal Value Network Fitting for Robotic Manipulation

Affiliation: School of Mechanical, Electrical and Information Engineering

Journal: IEEE Transactions on Industrial Informatics

Abstract: Hindsight experience replay (HER) has greatly increased the possibility of using deep reinforcement learning (DRL) for robotic manipulation with sparse rewards. However, there are still concerns about low learning efficiency and poor performance due to its insufficient exploration ability and bias against the initial goal introduced by HER. In this article, to solve this problem, a multigoal robotic manipulation DRL method based on adaptive intrinsic curiosity and universal value network fitting (AIC-UVNF) is proposed to further improve the exploration ability and learning performance. Specifically, this method utilizes an improved curiosity mechanism to construct a joint intrinsic reward and adaptively adjust the proportion, which can enhance exploration ability and avoid excessive pursuit of novel states. In addition, a universal value network fitting approach is proposed to incorporate the initial goal into the value function fitting process, which employs the value of the initial goal to eliminate the bias of HER in the algorithm update. Combined with the off-policy soft actor-critic method, AIC-UVNF is verified on multigoal robotic manipulation tasks. The results show that the proposed method achieves better convergence efficiency and learning performance.
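For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal sketch of standard HER goal relabeling with the commonly used "future" strategy. It is not the AIC-UVNF method and is not taken from the paper. The names `Transition`, `sparse_reward`, and `her_relabel`, the tolerance `tol`, and the 0 / -1 reward convention are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple

Vec = Tuple[float, ...]


@dataclass(frozen=True)
class Transition:
    """One step of a goal-conditioned episode (hypothetical container)."""
    state: Vec
    action: Vec
    next_state: Vec
    achieved_goal: Vec  # goal actually reached after taking the action
    goal: Vec           # goal the policy was conditioned on
    reward: float


def sparse_reward(achieved: Vec, goal: Vec, tol: float = 0.05) -> float:
    """Assumed convention: 0.0 within `tol` of the goal, -1.0 otherwise."""
    dist = sum((a - g) ** 2 for a, g in zip(achieved, goal)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def her_relabel(
    episode: List[Transition],
    k: int = 4,
    rng: Optional[random.Random] = None,
) -> List[Transition]:
    """Return the episode plus k relabeled copies of every transition.

    "future" strategy: each substitute goal is the achieved goal of a
    transition sampled from the same episode at the same or a later step.
    """
    rng = rng or random.Random(0)
    buffer = list(episode)  # keep the original transitions unchanged
    for t, tr in enumerate(episode):
        for _ in range(k):
            future = episode[rng.randrange(t, len(episode))]
            new_goal = future.achieved_goal
            buffer.append(
                replace(
                    tr,
                    goal=new_goal,
                    reward=sparse_reward(tr.achieved_goal, new_goal),
                )
            )
    return buffer
```

In this sketch every relabeled transition carries a substitute goal rather than the goal the episode started with. That substitution is where the bias against the initial goal, which the abstract sets out to correct, can enter.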

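All authors: Qiangyang Xu, Bao Pang, Rui Song, Yibin Li

First author: Zihao Sun

Paper type: Journal article

Corresponding authors: Xianfeng Yuan, Yong Song*

Paper ID: 1747462659626520577

Discipline category: Engineering

Page range: 12-27

Word count: 10

Translated work:

Publication date: 2024-12-01

Indexed in: SCI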