Fine-Tuning Offline Reinforcement Learning with Model-Based Policy Optimization



In offline reinforcement learning (RL), we attempt to learn a control policy from a fixed dataset of environment interactions. This setting has the potential benefit of allowing us to learn effective policies without needing to collect additional interactive data, which can be expensive or dangerous in real-world systems...

