This repository contains a Jupyter Notebook with an implementation of a Q-Learning agent that learns to solve the n-Chain OpenAI Gym environment.
This notebook is inspired by the following notebook: Deep Reinforcement Learning Course Notebook
The notebook contains a Q-Learning algorithm implementation and a training loop to solve the n-Chain OpenAI Gym environment. The Q-Learning algorithm is an off-policy temporal-difference control algorithm [1]:
Image taken from Richard S. Sutton and Andrew G. Barto, Reinforcement Learning: An Introduction, Second edition, 2014/2015, page 158
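The tabular update at the heart of Q-Learning can be sketched in a few lines. This is a minimal illustration, not the notebook's actual code: the function names `epsilon_greedy` and `q_learning_update`, the NumPy Q-table layout (rows are states, columns are actions), and the parameter names `alpha`, `gamma`, and `epsilon` are assumptions made for this example.

```python
import numpy as np


def epsilon_greedy(q_table, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(q_table.shape[1]))
    return int(np.argmax(q_table[state]))


def q_learning_update(q_table, state, action, reward, next_state, alpha, gamma):
    """Apply one Q-Learning step in place.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    # Off-policy target: bootstrap from the best action in the next state,
    # regardless of which action the behaviour policy will actually take.
    td_target = reward + gamma * np.max(q_table[next_state])
    q_table[state, action] += alpha * (td_target - q_table[state, action])
    return q_table


# Hypothetical usage: 5 states, 2 actions, all values initialised to zero.
rng = np.random.default_rng(0)
q = np.zeros((5, 2))
a = epsilon_greedy(q, state=0, epsilon=0.1, rng=rng)
q_learning_update(q, state=0, action=1, reward=2.0, next_state=0, alpha=0.5, gamma=0.9)
```

The `max` over next-state values in the target is what makes the method off-policy: the agent may explore with epsilon-greedy actions while still learning the value of the greedy policy.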
In this notebook we let different Q-Learning agents play the n-Chain environment and see how they perform in the game. The following agents are implemented:
- 🤓 Smart Agent 1: the agent explores and takes future rewards into account
- 🤑 Greedy Agent 2: the agent cares only about immediate rewards (small gamma)
- 😳 Shy Agent 3: the agent doesn't explore the environment (small epsilon)
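The three agents differ only in their hyperparameters, which could be captured roughly as follows. The `AgentConfig` class and every numeric value here are hypothetical and chosen only to illustrate "small gamma" and "small epsilon"; the values used in the notebook may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentConfig:
    """Hyperparameters of one Q-Learning agent (illustrative only)."""
    name: str
    gamma: float        # discount factor: weight given to future rewards
    epsilon: float      # exploration rate: probability of a random action
    alpha: float = 0.1  # learning rate (assumed equal for all agents)


# Hypothetical values, not taken from the notebook.
AGENTS = [
    AgentConfig("smart", gamma=0.95, epsilon=0.2),   # explores, values the future
    AgentConfig("greedy", gamma=0.05, epsilon=0.2),  # small gamma: immediate rewards only
    AgentConfig("shy", gamma=0.95, epsilon=0.01),    # small epsilon: almost no exploration
]

for agent in AGENTS:
    print(f"{agent.name}: gamma={agent.gamma}, epsilon={agent.epsilon}")
```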
The n-Chain environment is taken from the OpenAI Gym module. Documentation:
The image below shows an example of a 5-Chain (n = 5) environment with 5 states. a stands for action and r for the reward (Image Source).
This environment contains a chain with n positions, and every chain position corresponds to a possible state the agent can be in:
| state | description |
|---|---|
| 0 to n-1 (default n = 5) | position on the chain, counted from the start |
The agent can move along the chain using two actions, each of which yields a different reward:
| action | reward | description |
|---|---|---|
| 0 | no reward | move one position forward along the chain (next state = current state + 1) |
| 1 | small reward of 2 | jump back to state 0 |
The end of the chain offers a large reward of 10: an agent that stands at the end of the chain and keeps moving forward (action 0) collects this large reward repeatedly.
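The dynamics described above can be sketched as a small stand-alone class. This is not the Gym implementation: the class name `NChainSketch` and its `reset`/`step` signatures are assumptions for illustration, and the sketch is fully deterministic, so it leaves out any randomness the Gym version may add.

```python
class NChainSketch:
    """Deterministic sketch of the n-Chain rules described in this README."""

    def __init__(self, n=5, small=2, large=10):
        self.n = n          # number of chain positions (states 0 .. n-1)
        self.small = small  # reward for jumping back to the start
        self.large = large  # reward for moving forward at the end of the chain
        self.state = 0

    def reset(self):
        """Put the agent back at the start of the chain."""
        self.state = 0
        return self.state

    def step(self, action):
        """Apply an action and return (next_state, reward)."""
        if action == 1:
            # Action 1: jump back to state 0 and collect the small reward.
            self.state = 0
            return self.state, self.small
        if self.state < self.n - 1:
            # Action 0 before the end: move forward, no reward.
            self.state += 1
            return self.state, 0
        # Action 0 at the end: stay in place and collect the large reward.
        return self.state, self.large


env = NChainSketch()
env.reset()
rewards = [env.step(0)[1] for _ in range(6)]  # walk forward six times
print(rewards)
```

The sketch makes the trade-off visible: the small reward is available immediately from any state, while the large reward only appears after several unrewarded forward moves, which is why the discount factor and the exploration rate matter so much here.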
- OpenAI Gym: Gym is a toolkit for developing and comparing reinforcement learning algorithms from OpenAI
- OpenAI Baselines: OpenAI Baselines is a set of high-quality implementations of reinforcement learning algorithms
- Spinning Up in Deep RL: This is an educational resource produced by OpenAI that makes it easier to learn about deep reinforcement learning
- A Long Peek into Reinforcement Learning: Great blog post from Lilian Weng that briefly covers the field of Reinforcement Learning (RL), from fundamental concepts to classic algorithms
- Policy Gradient Algorithms: Another great blog post from Lilian Weng, where she writes about policy gradient algorithms

