This repo contains a Python implementation of the card game problem from Dr. David Silver's assignment called "Easy21".
(:link: Assignment Link)
This card game is "easy" in the sense that it simplifies the rules of the Blackjack example in Sutton and Barto, Section 5.3.
The purpose is to find the optimal policy for this game using various RL algorithms.
The rules are as follows:
- The game is played with an infinite deck of cards (i.e. cards are sampled with replacement)
- Each draw from the deck results in a value between 1 and 10 (uniformly distributed) with a colour of red (probability 1/3) or black (probability 2/3)
- There are no aces or picture (face) cards in this game
- At the start of the game both the player and the dealer draw one black card (fully observed)
- Each turn the player may either stick or hit
- If the player hits then she draws another card from the deck
- If the player sticks she receives no further cards
- The values of the player's cards are added (black cards) or subtracted (red cards)
- If the player's sum exceeds 21, or becomes less than 1, then she "goes bust" and loses the game (reward -1)
- If the player sticks then the dealer starts taking turns. The dealer always sticks on any sum of 17 or greater, and hits otherwise. If the dealer goes bust, then the player wins; otherwise the outcome is decided by the larger sum: win (reward +1), lose (reward -1), or draw (reward 0)
The implementation follows Dr. Silver's notes from Lecture 5, page 16. (:link: GLIE Monte Carlo Control)
This version is more practical than the algorithm in Sutton and Barto, page 101, in the sense that it introduces epsilon scheduling: given sufficiently many time steps, epsilon decays towards 0 and the epsilon-greedy policy converges to the optimal policy.
In short, this implementation can be described as on-policy every-visit GLIE MC control (with an epsilon-greedy policy).
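As a rough illustration of that control loop, here is a hedged sketch of every-visit GLIE MC control with the epsilon schedule from the lecture notes, epsilon = N0 / (N0 + N(s)), and a step size of 1 / N(s, a); the function and variable names (and the default N0 = 100) are assumptions, not necessarily what easy21.py uses.

```python
from collections import defaultdict
import random

def glie_mc_control(env_start, env_step, actions, episodes=100_000, n0=100):
    """On-policy every-visit GLIE Monte Carlo control with an epsilon-greedy policy.
    epsilon = n0 / (n0 + N(s)) decays as a state is visited more often, and the
    step size 1 / N(s, a) averages sampled returns into Q(s, a)."""
    Q = defaultdict(float)        # action-value estimates Q(s, a)
    N_s = defaultdict(int)        # state visit counts N(s)
    N_sa = defaultdict(int)       # state-action visit counts N(s, a)

    for _ in range(episodes):
        # Generate one episode with the current epsilon-greedy policy.
        episode, state, done = [], env_start(), False
        while not done:
            eps = n0 / (n0 + N_s[state])
            if random.random() < eps:
                action = random.choice(actions)
            else:
                action = max(actions, key=lambda a: Q[(state, a)])
            next_state, reward, done = env_step(state, action)
            episode.append((state, action, reward))
            state = next_state

        # Every-visit MC update towards the (undiscounted) return.
        G = sum(r for _, _, r in episode)   # only the terminal reward is non-zero here
        for s, a, r in episode:
            N_s[s] += 1
            N_sa[(s, a)] += 1
            Q[(s, a)] += (G - Q[(s, a)]) / N_sa[(s, a)]
            G -= r                          # return from the next time step onward
    return Q
```

With the environment sketch above this could be invoked as, for example, `Q = glie_mc_control(start, step, ["hit", "stick"])`; the scripts in this repo may organize the code differently.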
- Results - On-policy every-visit GLIE MC control
- Run easy21.py to perform the RL training
- Run plot.py to plot the results
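The plot of interest is typically the optimal value function V*(s) = max_a Q(s, a) over dealer showing and player sum. Here is a hedged sketch of such a plot, assuming the `Q` dictionary produced by the control sketch above; the actual plot.py may differ.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_value_function(Q, actions=("hit", "stick")):
    """Surface plot of V*(s) = max_a Q(s, a) over dealer showing (1-10) and player sum (1-21)."""
    dealer_range = np.arange(1, 11)
    player_range = np.arange(1, 22)
    V = np.array([[max(Q[((d, p), a)] for a in actions) for d in dealer_range]
                  for p in player_range])
    D, P = np.meshgrid(dealer_range, player_range)
    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    ax.plot_surface(D, P, V, cmap="viridis")
    ax.set_xlabel("Dealer showing")
    ax.set_ylabel("Player sum")
    ax.set_zlabel("V*(s)")
    plt.show()
```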
To do.