International Journal of Advanced Computer Science and Applications (IJACSA), Volume 15 Issue 1, 2024.
Abstract: The memorization and reuse of experience, popularly known as experience replay (ER), has improved the performance of off-policy deep reinforcement learning (DRL) algorithms such as deep Q-networks (DQN) and deep deterministic policy gradients (DDPG). Despite its success, ER faces the challenges of noisy transitions, large memory sizes, and unstable returns. Researchers have introduced replay mechanisms focusing on experience selection strategies to address these issues. However, the choice of experience retention strategy has a significant influence on the selection strategy. Experience Replay Optimization (ERO) is a novel reinforcement learning algorithm that uses a deep replay policy for experience selection. However, ERO relies on the naïve first-in-first-out (FIFO) retention strategy, which seeks to manage replay memory by constantly retaining recent experiences irrespective of their relevance to the agent’s learning. FIFO sequentially overwrites the oldest experience with a new one when the replay memory is full. To improve the retention strategy of ERO, we propose an experience replay optimization with enhanced sequential memory management (ERO-ESMM). ERO-ESMM uses an improved sequential retention strategy to manage the replay memory efficiently and stabilize the performance of the DRL agent. The efficacy of the ESMM strategy is evaluated together with five additional retention strategies across four distinct OpenAI environments. The experimental results indicate that ESMM performs better than the other five fundamental retention strategies.
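The FIFO retention strategy that ERO-ESMM seeks to improve on can be sketched as a circular buffer that unconditionally overwrites the oldest transition. The class name and interface below are illustrative, not taken from the paper; the uniform sampling shown is the baseline that ERO's learned replay policy replaces.

```python
import random

class FIFOReplayBuffer:
    """Minimal FIFO replay memory: once full, the oldest transition is
    overwritten by the newest one, regardless of its relevance to learning."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = [None] * capacity
        self.position = 0  # next slot to write (wraps around circularly)
        self.size = 0

    def store(self, transition):
        # Sequentially overwrite the oldest experience when memory is full.
        self.buffer[self.position] = transition
        self.position = (self.position + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def sample(self, batch_size):
        # Uniform sampling; a selection strategy such as ERO's replay
        # policy would instead score and filter these transitions.
        return random.sample(self.buffer[:self.size], batch_size)
```

As the sketch makes clear, FIFO's retention decision depends only on a transition's age, which is the limitation the proposed ESMM retention strategy addresses.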
Richard Sakyi Osei and Daphne Lopez, “Experience Replay Optimization via ESMM for Stable Deep Reinforcement Learning,” International Journal of Advanced Computer Science and Applications (IJACSA), 15(1), 2024. http://dx.doi.org/10.14569/IJACSA.2024.0150171
@article{Osei2024,
title = {Experience Replay Optimization via ESMM for Stable Deep Reinforcement Learning},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2024.0150171},
url = {http://dx.doi.org/10.14569/IJACSA.2024.0150171},
year = {2024},
publisher = {The Science and Information Organization},
volume = {15},
number = {1},
author = {Richard Sakyi Osei and Daphne Lopez}
}
Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.