This repository contains PyTorch implementations of common offline RL baselines and offline-to-online RL algorithms. The code (and coding style) is largely inspired by CORL; we highly recommend checking out the CORL repo as well.
The repo contains one folder per algorithm with its implementation code. To run the code, please follow the instructions below.
First, create a conda environment with Python 3.9.16:

```
conda create -n off_offon python=3.9.16
```
Then, download and install MuJoCo 2.1 and set it up following this instruction.
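For reference, a typical MuJoCo 2.1 setup on Linux extracts the `mujoco210` release into `~/.mujoco` and exposes its libraries via `LD_LIBRARY_PATH`. The paths below are the conventional defaults, not something this repo prescribes; adjust them to your machine.

```shell
# Assumed layout: the mujoco210 release extracted to ~/.mujoco/mujoco210
# Add this to your shell profile (e.g. ~/.bashrc) so mujoco-py can find the libs:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/.mujoco/mujoco210/bin
```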
Finally, activate the environment and install all the dependencies:

```
conda activate off_offon
pip install -r requirements.txt
```
- AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
- Ashvin Nair, Abhishek Gupta, Murtaza Dalal, Sergey Levine. 2020.
- Offline Reinforcement Learning with Implicit Q-Learning
- Ilya Kostrikov, Ashvin Nair, Sergey Levine. 2021.
- The In-Sample Softmax for Offline Reinforcement Learning
- Chenjun Xiao, Han Wang, Yangchen Pan, Adam White, Martha White. 2023.
- Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization
- Haoran Xu, Li Jiang, Jianxiong Li, Zhuoran Yang, Zhaoran Wang, Victor Wai Kin Chan, Xianyuan Zhan. 2023.
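To make the ideas behind the listed papers concrete, here is a minimal NumPy sketch (not this repo's code; the function names are ours) of two building blocks that recur across them: the asymmetric expectile regression used for IQL's value update, and the exponential advantage weights used by AWAC-style actor updates.

```python
import numpy as np

def expectile_loss(diff, tau=0.7):
    """IQL-style expectile regression: |tau - 1(diff < 0)| * diff^2.

    `diff` is Q(s, a) - V(s); tau > 0.5 pushes V toward the upper
    expectile of Q, approximating an in-sample maximum.
    """
    weight = np.where(diff < 0, 1.0 - tau, tau)
    return weight * diff ** 2

def awr_weights(advantages, lam=1.0, clip=100.0):
    """AWAC-style actor weights: exp(A / lambda), clipped for stability.

    The policy is trained by weighted regression onto dataset actions,
    so higher-advantage actions get exponentially larger weight.
    """
    return np.minimum(np.exp(advantages / lam), clip)
```

The expectile loss is what lets IQL learn a value function without querying out-of-distribution actions, while the exponential weighting keeps the AWAC actor close to the behavior policy; both are simple enough to drop into a standard actor-critic training loop.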