Journal article

Parallel Deep Reinforcement Learning Method for Gait Control of Biped Robot

Abstract

In this brief, a parallel Deep Deterministic Policy Gradient (DDPG) algorithm is presented for biped robot gait control. Biped robot gait control is a high-dimensional continuous control problem, and obtaining a fast and stable gait is challenging. Traditional methods cannot fully utilize the autonomous exploration capability of a biped robot. Multiple Actor-Critic (AC) networks are established to expand the scope of exploration and improve training efficiency. To optimize the experience replay mechanism, an experience filtering unit is introduced, and a cosine similarity method is used to classify experiences. Then, a Markov Decision Process (MDP) model based on knowledge and experience is designed to address the problem of sparse rewards. Finally, experimental results show that the parallel DDPG algorithm enables the biped robot to walk faster and more stably, reaching a speed of 0.62 m/s.
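
The abstract does not say how the cosine similarity method classifies experience, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes each experience is summarized by a numeric state vector and compared against a reference vector; the function names, the `threshold` value, and the "similar"/"novel" labels are all hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between a and b (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify_experience(
    state: Sequence[float],
    reference: Sequence[float],
    threshold: float = 0.9,  # hypothetical cut-off, not taken from the paper
) -> str:
    """Label an experience by how closely its state vector aligns with a reference."""
    if cosine_similarity(state, reference) >= threshold:
        return "similar"
    return "novel"
```

In a replay buffer, a label of this kind could decide which pool an experience is stored in or how often it is sampled; the criterion actually used in the paper may differ.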

Authors

Tao C; Xue J; Zhang Z; Gao Z

Journal

IEEE Transactions on Circuits and Systems II: Express Briefs, Vol. 69, No. 6, pp. 2802–2806

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Publication Date

June 1, 2022

DOI

10.1109/tcsii.2022.3145373

ISSN

1549-7747
