Commit

Update README.md
zzmtsvv authored Sep 15, 2023
1 parent 15ac23b commit 816f760
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion README.md
@@ -17,6 +17,7 @@
- CNF[18]: [Logs](https://wandb.ai/zzmtsvv/CNF/workspace?workspace=user-zzmtsvv) & [Implementation](https://github.com/zzmtsvv/cnf)
- offline O3F[22]: [Logs](https://wandb.ai/zzmtsvv/offline_O3F?workspace=user-zzmtsvv) (realised for offline learning, not as stated in the paper)
- XQL[23]: [Logs](https://wandb.ai/zzmtsvv/XQL?workspace=user-zzmtsvv)
- TD7[24]: [Logs](https://wandb.ai/zzmtsvv/TD7?workspace=user-zzmtsvv)

At the moment, offline training is realised for these models. Logs are available below; they cover training only, without evaluation, because installing MuJoCo was forbidden on the machine, so I trained the models with preloaded pickle and JSON datasets.

@@ -69,4 +70,5 @@ As can be seen from plots and concrete examples on videos, `spot` performs much
[20] Li, Zhan et al. (2023). [When Data Geometry Meets Deep Function: Generalizing Offline Reinforcement Learning](https://arxiv.org/abs/2205.11027).<br/>
[21] Kumar, Fu et al. (2019). [Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction](https://proceedings.neurips.cc/paper_files/paper/2019/file/c2073ffa77b5357a498057413bb09d3a-Paper.pdf).<br/>
[22] Mark, Ghardizadeh et al. (2023). [Fine-Tuning Offline Policies With Optimistic Action Selection](https://openreview.net/forum?id=2x8EKbGU51k). <br/>
-[23] Garg, Hejna et al. (2023). [Extreme Q-Learning: MaxEnt RL without Entropy](https://arxiv.org/abs/2301.02328)
+[23] Garg, Hejna et al. (2023). [Extreme Q-Learning: MaxEnt RL without Entropy](https://arxiv.org/abs/2301.02328) <br/>
+[24] Fujimoto, Chang et al. (2023). [For SALE: State-Action Representation Learning for Deep Reinforcement Learning](https://arxiv.org/abs/2306.02451)
