Yong Song

Professor
Supervisor of Doctorate Candidates
Supervisor of Master's Candidates


Other Post: Guest Editor of International Journal of Applied Intelligence; Reviewer of IEEE Access, Advances in Mechanical Engineering, and several other journals
Gender: Male
Alma Mater: Shandong University
Education Level: Postgraduate (Doctoral)
Degree: Doctoral Degree in Engineering
Status: Employed
School/Department: School of Mechanical, Electrical & Information Engineering
Date of Employment: 2001-07-01
College: School of Mechanical, Electrical & Information Engineering, Weihai
Discipline: Control Theory and Control Engineering
Business Address: Room 307, South Part of Zhixing Building

A Deep Reinforcement Learning Approach Using Asymmetric Self-Play for Robust Multirobot Flocking

Institution: School of Control Science and Engineering

Title of Paper: A Deep Reinforcement Learning Approach Using Asymmetric Self-Play for Robust Multirobot Flocking

Journal: IEEE Transactions on Industrial Informatics

Key Words: Robots; Adaptation models; Training; Collision avoidance; Navigation; Multi-robot systems; Uncertainty; Robot sensing systems; Robustness; Vehicle dynamics; Adversarial training; flocking; multiagent deep reinforcement learning (MADRL); autonomous vehicles

Summary: Flocking control, as an essential approach for survivable navigation of multirobot systems, has been widely applied in fields such as logistics, service delivery, and search and rescue. However, realistic environments are typically complex, dynamic, and even aggressive, posing considerable threats to the safety of flocking robots. In this article, an Asymmetric Self-play-empowered Flocking Control framework based on deep reinforcement learning is proposed to address this concern. Specifically, the flocking robots are trained concurrently with learnable adversarial interferers to stimulate the intelligence of the flocking strategy. A two-stage self-play training paradigm is developed to improve the robustness and generalization of the model. Furthermore, an auxiliary training module for learning the transition dynamics is designed, dramatically enhancing adaptability to environmental uncertainties. Feature-level and agent-level attention are implemented for action and value generation, respectively. Both extensive comparative experiments and real-world deployment demonstrate the superiority and practicality of the proposed framework.
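
As a rough illustration of the training scheme sketched in the abstract, the following is a minimal, hypothetical Python sketch of the two-stage asymmetric self-play schedule only: the flocking policy is first trained against a frozen interferer, then both are optimized concurrently with opposed rewards. All names (Policy, rollout, train_asymmetric_self_play), the toy environment, and the reward shaping are illustrative assumptions and do not reproduce the paper's attention modules, auxiliary dynamics module, or MADRL algorithm.

# Minimal, hypothetical sketch of a two-stage asymmetric self-play schedule.
# Names and rewards are illustrative, not the paper's implementation.
import random

class Policy:
    """Toy stand-in for a learnable policy (flocking agents or interferer)."""
    def __init__(self, name):
        self.name = name
        self.skill = 0.0  # scalar proxy for policy quality

    def act(self, obs):
        # Placeholder action: random perturbation scaled by current skill.
        return [random.uniform(-1, 1) * (1.0 + self.skill) for _ in obs]

    def update(self, reward):
        # Placeholder learning step: nudge skill toward the observed reward.
        self.skill += 0.01 * reward

def rollout(flock, interferer, steps=50):
    """Toy episode: the interferer perturbs the flock; returns the flock's reward."""
    obs = [0.0] * 4
    flock_reward = 0.0
    for _ in range(steps):
        a_flock = flock.act(obs)
        a_adv = interferer.act(obs)
        # Flock is rewarded for staying cohesive despite the interference.
        flock_reward += -abs(sum(a_flock) - sum(a_adv)) * 0.01
    return flock_reward

def train_asymmetric_self_play(stage1_iters=100, stage2_iters=100):
    flock = Policy("flock")
    interferer = Policy("interferer")

    # Stage 1: train the flocking policy against a frozen interferer to learn
    # basic cohesive navigation before facing an adaptive adversary.
    for _ in range(stage1_iters):
        r = rollout(flock, interferer)
        flock.update(r)

    # Stage 2: concurrent (asymmetric) training -- the interferer is optimized
    # to minimize the flock's reward while the flock keeps adapting.
    for _ in range(stage2_iters):
        r = rollout(flock, interferer)
        flock.update(r)
        interferer.update(-r)  # adversary receives the negated flock reward

    return flock, interferer

if __name__ == "__main__":
    flock, interferer = train_asymmetric_self_play()
    print("final flock skill:", round(flock.skill, 3))
    print("final interferer skill:", round(interferer.skill, 3))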

First Author: Yunjie Jia

Corresponding Author: Yong Song

All the Authors: Jiyu Cheng, Jiong Jin, Wei Zhang, Simon X. Yang, Sam Kwong

Document Code: 1889209004728918017

Discipline: Engineering

Volume: 21

Issue: 4

Page Number: 3266-3275

Number of Words: 20

Translation or Not: No

Date of Publication: 2025-01

Included Journals: SCI

Release Time: 2025-11-29