Preprint
Conservative Bayesian Model-Based Value Expansion for Offline Policy Optimization
Abstract
Offline reinforcement learning (RL) addresses the problem of learning a
performant policy from a fixed batch of data collected by following some
Authors
Jeong J; Wang X; Gimelfarb M; Kim H; Abdulhai B; Sanner S
Publication date
October 7, 2022
DOI
10.48550/arxiv.2210.03802
Preprint server
arXiv