Journal article

Risk-averse supply chain management via robust reinforcement learning

Abstract

Classical reinforcement learning (RL) may suffer performance degradation when the environment deviates from training conditions, limiting its application in risk-averse supply chain management. This work explores using robust RL in supply chain operations to hedge against environment inconsistencies and changes. Two robust RL algorithms, Q̂-learning and β-pessimistic Q-learning, are examined against conventional Q-learning and a baseline order-up-to inventory policy. Furthermore, this work extends RL applications from forward to closed-loop supply chains. Two case studies are conducted using a supply chain simulator developed with agent-based modeling. The results show that Q-learning can outperform the baseline policy under normal conditions, but notably degrades under environment deviations. By comparison, the robust RL models tend to make more conservative inventory decisions to avoid large shortage penalties. Specifically, fine-tuned β-pessimistic Q-learning can achieve good performance under normal conditions and maintain robustness against moderate environment inconsistencies, making it suitable for risk-averse decision-making.
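For illustration, the β-pessimistic Q-learning update mentioned above can be sketched as a tabular update whose bootstrap target blends the best-case and worst-case next-state values with weight β, so that β = 0 recovers ordinary Q-learning and larger β yields more conservative value estimates. This is a minimal sketch assuming the standard formulation of β-pessimistic Q-learning; the function name, parameter values, and state/action indexing are illustrative and are not taken from the paper.

import numpy as np

def beta_pessimistic_q_update(Q, s, a, reward, s_next,
                              alpha=0.1, gamma=0.95, beta=0.2):
    # One tabular update of beta-pessimistic Q-learning (illustrative).
    # Standard Q-learning bootstraps on max_a' Q(s', a'); here the target
    # mixes the best-case and worst-case next-state values, biasing the
    # agent toward conservative, risk-averse decisions.
    best_next = np.max(Q[s_next])    # optimistic next-state value
    worst_next = np.min(Q[s_next])   # pessimistic next-state value
    target = reward + gamma * ((1.0 - beta) * best_next + beta * worst_next)
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

In a supply chain setting, the state might encode an inventory position and the action an order quantity; tuning β then trades off nominal performance against robustness to environment deviations, consistent with the trade-off described in the abstract.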

Authors

Wang J; Swartz CLE; Huang K

Journal

Computers & Chemical Engineering, Vol. 192, ,

Publisher

Elsevier

Publication Date

January 1, 2025

DOI

10.1016/j.compchemeng.2024.108912

ISSN

0098-1354
