Yong Song

Professor
Supervisor of Doctorate Candidates
Supervisor of Master's Candidates


Other Post:Guest Editor of International Journal of Applied Intelligence; Reviewer of IEEE Access, Advances in Mechanical Engineering, and several other journals
Gender:Male
Alma Mater:Shandong University
Education Level:Postgraduate (Doctoral)
Degree:Doctoral Degree in Engineering
Status:Employed
School/Department:School of Mechanical, Electrical & Information Engineering
Date of Employment:2001-07-01
College: School of Mechanical, Electrical & Information Engineering, Weihai
Discipline:Control Theory and Control Engineering
Business Address:Room 307, South Part of Zhixing Building

Goal-Conditioned Reinforcement Learning With Adaptive Intrinsic Curiosity and Universal Value Network Fitting for Robotic Manipulation

Institution:School of Low-Altitude Science and Engineering

Title of Paper:Goal-Conditioned Reinforcement Learning With Adaptive Intrinsic Curiosity and Universal Value Network Fitting for Robotic Manipulation

Journal:IEEE Transactions on Industrial Informatics

Key Words:Adaptive intrinsic curiosity (AIC), hindsight experience replay (HER), robotic manipulation, universal value network fitting (UVNF)

Summary:Hindsight experience replay (HER) has greatly increased the possibility of using deep reinforcement learning (DRL) for robotic manipulation with sparse rewards. However, there are still concerns about low learning efficiency and poor performance due to its insufficient exploration ability and bias against the initial goal introduced by HER. In this article, to solve this problem, a multigoal robotic manipulation DRL method based on adaptive intrinsic curiosity and universal value network fitting (AIC-UVNF) is proposed to further improve the exploration ability and learning performance. Specifically, this method utilizes an improved curiosity mechanism to construct a joint intrinsic reward and adaptively adjust the proportion, which can enhance exploration ability and avoid excessive pursuit of novel states. In addition, a universal value network fitting approach is proposed to incorporate the initial goal into the value function fitting process, which employs the value of the initial goal to eliminate the bias of HER in the algorithm update. Combined with the off-policy soft actor-critic method, AIC-UVNF is verified on multigoal robotic manipulation tasks. The results show that the proposed method achieves better convergence efficiency and learning performance.
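
For readers unfamiliar with the background terms, the following is a minimal Python sketch of the two generic ideas the summary builds on: HER-style goal relabeling under a sparse reward, and a weighted sum of extrinsic and intrinsic reward. It is not the AIC-UVNF algorithm from the paper. The function names, the distance tolerance, the "future" relabeling strategy, and in particular the adaptive_weight rule (which shrinks the curiosity weight as the success rate rises) are illustrative assumptions, not details taken from the article.

```python
import random

def sparse_reward(achieved, goal, tol=0.05):
    """Return 0.0 if the achieved state is within tol of the goal, else -1.0."""
    dist = sum((a - g) ** 2 for a, g in zip(achieved, goal)) ** 0.5
    return 0.0 if dist <= tol else -1.0

def her_relabel(episode, k=4, rng=None):
    """Generic HER relabeling with the 'future' strategy (assumed here).

    episode: list of dicts with keys 'obs', 'action', 'achieved', 'goal',
    where 'achieved' is the state reached after taking 'action'.
    Returns the original transitions plus k relabeled copies of each.
    """
    rng = rng or random.Random(0)
    out = []
    for t, tr in enumerate(episode):
        # Keep the original transition, rewarded against the initial goal.
        out.append({**tr, "reward": sparse_reward(tr["achieved"], tr["goal"])})
        for _ in range(k):
            # Substitute a goal that was actually achieved at step t or later.
            future = rng.randrange(t, len(episode))
            new_goal = episode[future]["achieved"]
            out.append({**tr, "goal": new_goal,
                        "reward": sparse_reward(tr["achieved"], new_goal)})
    return out

def adaptive_weight(success_rate, base=0.5):
    """Hypothetical schedule: less curiosity as the task success rate grows."""
    clipped = min(max(success_rate, 0.0), 1.0)
    return base * (1.0 - clipped)

def joint_reward(extrinsic, intrinsic, weight):
    """Combine the task reward with a weighted intrinsic (curiosity) bonus."""
    return extrinsic + weight * intrinsic
```

In this sketch the relabeled transitions are rewarded against substituted goals rather than the initial goal, which is the source of the bias toward hindsight goals that the summary says the universal value network fitting step is meant to correct.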

First Author:Zihao Sun

Correspondence Author:Xianfeng Yuan,Yong Song*

All the Authors:Qiangyang Xu,Bao Pang,Rui Song,Yibin Li

Document Code:1747462659626520577

Discipline:Engineering

Volume:21

Issue:3

Page Number:1-15

Number of Words:10

Translation or Not:No

Date of Publication:2025-03

Included Journals:SCI

Release Time:2024-12-27