This repository contains an RL environment based on open-source game Gameplay Football. It was created by the Google Brain team for research purposes.
This is not an official Google product.
We'd like to thank Bastiaan Konings Schuiling, who authored and open-sourced the original version of this game.
For more information, please look at our paper.
Mailing list: https://groups.google.com/forum/#!forum/google-research-football
Currently we're supporting only Linux and Python3. You can either install the code from github (newest version) or from pypi (stable version).
-
Install required apt packages with
sudo apt-get install git cmake build-essential libgl1-mesa-dev libsdl2-dev libsdl2-image-dev libsdl2-ttf-dev libsdl2-gfx-dev libboost-all-dev libdirectfb-dev libst-dev mesa-utils xvfb x11vnc libsqlite3-dev glee-dev libsdl-sge-dev python3-pip
-
Install gfootball python package from pypi:
- Use `pip3 install gfootball[tf_cpu]` if you want to use CPU version of TensorFlow.
- Use `pip3 install gfootball[tf_gpu]` if you want to use GPU version of TensorFlow.
- This command can run for couple of minutes, as it compiles the C++ environment in the background.
OR install gfootball python package (run the commands from the main project directory):
- `git clone https://github.com/google-research/football.git`
- `cd football`
- Use `pip3 install .[tf_cpu]` if you want to use the CPU version of TensorFlow.
- Use `pip3 install .[tf_gpu]` if you want to use GPU version of TensorFlow.
- This command can run for a couple of minutes, as it compiles the C++ environment in the background.
First, install newest OpenAI Baselines:
pip3 install git+https://github.com/openai/baselines.git@master
.
Then:
- To run example PPO experiment on
academy_empty_goal
scenario, runpython3 -m gfootball.examples.run_ppo2 --level=academy_empty_goal_close
- To run on
academy_pass_and_shoot_with_keeper
scenario, runpython3 -m gfootball.examples.run_ppo2 --level=academy_pass_and_shoot_with_keeper
In order to train with nice replays being saved, run
python3 -m gfootball.examples.run_ppo2 --dump_full_episodes=True --render=True
Run python3 -m gfootball.play_game
. By default, it starts the
base scenario and the left player is controlled by the keyboard. Different types
of players are supported (gamepad, external bots, agents...). For possible
options run python3 -m gfootball.play_game -helpfull
.
Please note that playing the game is implemented through an environment, so human-controlled players use the same interface as the agents. One important fact is that there is a single action per 100 ms reported to the environment, which might cause a lag effect when playing.
The game defines following keyboard mapping (for the keyboard
player type):
ARROW UP
- run to the top.ARROW DOWN
- run to the bottom.ARROW LEFT
- run to the left.ARROW RIGHT
- run to the right.S
- short pass in attack mode, pressure in the defense mode.A
- high pass in attack mode, sliding in the defense mode.D
- shot in the attack mode, team pressure in the defense mode.W
- long pass in the attack mode, goalkeeper pressure in the defense mode.Q
- switch the active player on defense mode.C
- dribble in the attack mode.E
- sprint.
Environment exposes following raw
observations:
- Ball information:
ball
- [x, y, z] position of the ball.ball_direction
- [x, y, z] ball movement vector.ball_rotation
- [x, y, z] rotation angles in radians.ball_owned_team
- {-1, 0, 1}, -1 = ball not owned, 0 = left team, 1 = right team.ball_owned_player
- {0..N-1} integer denoting index of the player owning the ball.
- Left team:
left_team
- N-elements vector with [x, y] positions of players.left_team_direction
- N-elements vector with [x, y] movement vectors of players.left_team_tired_factor
- N-elements vector of floats in the range {0..1}. 0 means player is not tired at all.left_team_yellow_card
- N-elements vector of integers denoting number of yellow cards a given player has (0 or 1).left_team_active
- N-elements vector of Bools denoting whether a given player is playing the game (False means player got a red card).left_team_roles
- N-elements vector denoting roles of players. The meaning is:0
= e_PlayerRole_GK - goalkeeper,1
= e_PlayerRole_CB - centre back,2
= e_PlayerRole_LB - left back,3
= e_PlayerRole_RB - right back,4
= e_PlayerRole_DM - defence midfield,5
= e_PlayerRole_CM - central midfield,6
= e_PlayerRole_LM - left midfield,7
= e_PlayerRole_RM - right midfield,8
= e_PlayerRole_AM - attack midfield,9
= e_PlayerRole_CF - central front,
- Right team:
right_team
- same as for left team.right_team_direction
- same as for left team.right_team_tired_factor
- same as for left team.right_team_yellow_card
- same as for left team.right_team_active
- same as for left team.right_team_roles
- same as for left team.
- Controlled players information (M is the number of controlled players):
active
- M-elements list of {0..N-1} integers denoting indices of the controlled players. In most common case the list will contain just one element, but in multi-agent setup it will be more.sticky_actions
- M-elements list of 11-elements vectors of 0s or 1s denoting whether corresponding action is active:0
-game_left
1
-game_top_left
2
-game_top
3
-game_top_right
4
-game_right
5
-game_bottom_right
6
-game_bottom
7
-game_bottom_left
8
-game_sprint
9
-game_keeper_rush
10
-game_dribble
- Match state:
score
- pair of integers denoting number of goals for left and right teams, respectively.steps_left
- how many steps are left till the end of the match.game_mode
- current game mode, one of:0
=e_GameMode_Normal
1
=e_GameMode_KickOff
2
=e_GameMode_GoalKick
3
=e_GameMode_FreeKick
4
=e_GameMode_Corner
5
=e_GameMode_ThrowIn
6
=e_GameMode_Penalty
- Screen:
frame
- three 1280x720 vectors of RGB pixels representing rendered screen. It is only exposed when rendering is enabled (render
flag).
Where N
is the number of players on the team.
X coordinates are in the range [-1, 1]
.
Y coordinates are in the range [-0.42, 0.42]
.
Speed vectors represent a change in the position of the object within a single
step.
In addition, environment provides wrappers which convert raw
observations to a different form:
simple115
(aka Simple115StateWrapper) - simplified representation of a game state encoded with 115 floats:- 44 = 11 (players) * 2 (x and y) * 2 (teams) - coordinates of players
- 44 - player directions
- 3 (x, y and z) - ball position
- 3 - ball direction
- 3 - one hot encoding of ball ownership (noone, left, right)
- 11 - one hot encoding of which player is active
- 7 - one hot encoding of
game_mode
extracted
(aka SMMWrapper) - simplified spacial representation of a game state. It consists of several 72 * 96 planes ob bytes, filled in with 0s except for:- 1st plane: 255s represent positions of players on the left team
- 2nd plane: 255s represent positions of players on the right team
- 2nd plane: 255s represent positions of a ball
- 4th plane: 255s represent positions of an active player
- 5th plane (optional, available when
enable_sides_swap
is enabled): either all 0s if playing from the left, or all 255s if playing from the right
pixels
/pixels_gray
(aka PixelsStateWrapper) - pixel representation, downscaled to 72 * 96, and converted to a single grayscale channel forpixels_gray
. In order to use this representation you have to enable rendering in create_environment call (and run on GPU).
For trivial integration with single agent/player learning algorithms we provide a SingleAgentWrapper wrapper, that strips the first dimension of the representation in the case of only one player being controlled.
We provide two sets of scenarios/levels:
-
Football Benchmarks
- 11_vs_11_stochastic - full 90 minutes football game (medium difficulty)
- 11_vs_11_easy_stochastic - full 90 minutes football game (easy difficulty)
- 11_vs_11_hard_stochastic - full 90 minutes football game (hard difficulty)
-
Football Academy - with a total of 11 scenarios
- academy_empty_goal_close - Our player starts inside the box with the ball, and needs to score against an empty goal.
- academy_empty_goal - Our player starts in the middle of the field with the ball, and needs to score against an empty goal.
- academy_run_to_score - Our player starts in the middle of the field with the ball, and needs to score against an empty goal. Five opponent players chase ours from behind.
- academy_run_to_score_with_keeper - Our player starts in the middle of the field with the ball, and needs to score against a keeper. Five opponent players chase ours from behind.
- academy_pass_and_shoot_with_keeper - Two of our players try to score from the edge of the box, one is on the side with the ball, and next to a defender. The other is at the center, unmarked, and facing the opponent keeper.
- academy_run_pass_and_shoot_with_keeper - Two of our players try to score from the edge of the box, one is on the side with the ball, and unmarked. The other is at the center, next to a defender, and facing the opponent keeper.
- academy_3_vs_1_with_keeper - Three of our players try to score from the edge of the box, one on each side, and the other at the center. Initially, the player at the center has the ball and is facing the defender. There is an opponent keeper.
- academy_corner - Standard corner-kick situation, except that the corner taker can run with the ball from the corner.
- academy_counterattack_easy - 4 versus 1 counter-attack with keeper; all the remaining players of both teams run back towards the ball.
- academy_counterattack_hard - 4 versus 2 counter-attack with keeper; all the remaining players of both teams run back towards the ball.
- academy_single_goal_versus_lazy - Full 11 versus 11 games, where the opponents cannot move but they can only intercept the ball if it is close enough to them. Our center back defender has the ball at first.
You can add your own scenarios by adding a new file to the gfootball/scenarios/
directory. Have a look at existing scenarios for example.
Using play_game script (see 'Playing game yourself' section for details)
it is possible to set up a game played between multiple agents.
players
command line parameter is a comma-separated lists of players for both teams.
For example, to play yourself using a gamepad with two lazy bots on your team
against three bots you can run
python3 -m gfootball.play_game --players=gamepad:left_players=1;lazy:left_players=2;bot:right_players=1;bot:right_players=1;bot:right_players=1
.
Notice the use of left_players=2
for the lazy player, bot player does not support it.
You can implement your own player controlling multiple players by adding its implementation to the env/players directory (no other changes are needed). Have a look at existing players code for an example implementation.
To train a policy controlling multiple players, one has to do the following:
- pass
number_of_players_agent_controls
to the 'create_environment' defining how many players you want to control - instead of calling '.step' function with a single action, call it with an array of actions, one action per player
It is up to the caller to unpack/postprocess it in a desirable way. A simple example of training multi-agent can be found in examples/run_multiagent_rllib.py
- Build with
docker build --build-arg DOCKER_BASE=ubuntu:18.04 --build-arg DEVICE=cpu . -t gfootball
- Enter the image with
docker run -it gfootball bash
- Build with
docker build --build-arg DOCKER_BASE=tensorflow/tensorflow:1.12.0-gpu-py3 --build-arg DEVICE=gpu . -t gfootball
- Enter the image with
nvidia-docker run -it gfootball bash
After entering the image, you can run sample training with python3 -m gfootball.examples.run_ppo2
.
Unfortunately, rendering is not supported inside the docker.
You may need to increase memory for building. Go to the docker menu, then Preferences, then Advanced/Memory and set memory to the 4GB.
GFootball environment supports recording of scenarios for later watching or
analysis. Each trace dump consists of a pickled episode trace (observations,
reward, additional debug info) and optionally an AVI file with the rendered episode.
Pickled episode trace can be played back later on using replay.py
script.
By default trace dumps are disabled to not occupy disk space. They
are controlled by the following set of flags:
dump_full_episodes
- should trace for each entire episode be recorded.dump_scores
- should sample traces for scores be recorded.tracesdir
- directory in which trace dumps are saved.write_video
- should a video be recorded together with the trace. If rendering is disabled (render
config flag), the video contains a simple episode animation.
There are following scripts provided to operate on trace dumps:
dump_to_txt.py
- converts trace dump to human-readable form.dump_to_video.py
- converts trace dump to a 2D representation video.replay.py
- replays a given trace dump using environment.
Solution: set environment variables for MESA driver, like this:
MESA_GL_VERSION_OVERRIDE=3.2 MESA_GLSL_VERSION_OVERRIDE=150 python3 -m gfootball.play_game