ShangtongZhang/DeepRL

This branch contains the code for the paper

Provably Convergent Two-Timescale Off-Policy Actor-Critic with Function Approximation
Shangtong Zhang, Bo Liu, Hengshuai Yao, Shimon Whiteson (ICML 2020)

.
├── Dockerfile                            # Dependencies
├── requirements.txt                      # Dependencies
├── template_jobs.py                      # Entry point for the experiments
│   ├── gem_baird                         # Entry point for the Baird's counterexample experiments
│   └── cof_pac                           # Entry point for the Reacher experiments
├── deep_rl/agent/GEM_agent.py            # Implementation of GEM and GEM-ETD for tabular / linear settings 
├── deep_rl/agent/COFPAC_agent.py         # Implementation of COF-PAC with semi-gradient emphasis learning 
└── template_plot.py                      # Plotting

I can send the data for plotting via email upon request.

This branch is based on the DeepRL codebase and has been left unchanged since I completed the paper. Algorithm implementations not used in the paper may be broken and should not be used. Rebasing onto or merging the master branch may take extra effort.