Olle - Atari mania

Presentation

Download project_pres.pdf

Quick recap: I trained DQN and PPO agents on Atari games. I used the OpenAI Gym environments, training on GameName-v4 (i.e. pixels as inputs), and the tf-agents implementations of agents, networks, replay buffers, and drivers.

Major learned points

Save checkpoints and policies

Tutorial:

https://www.tensorflow.org/agents/tutorials/10_checkpointer_policysaver_tutorial?hl=fi_FI

Checkpoints (model, replay buffer, etc.): https://www.tensorflow.org/agents/api_docs/python/tf_agents/utils/common/Checkpointer?hl=fi_FI

Policies (lighter version): https://www.tensorflow.org/agents/api_docs/python/tf_agents/policies/policy_saver/PolicySaver
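The two savers above can be wired up roughly as follows. This is a sketch based on the tf-agents tutorial linked above, not my exact training code: the `agent`, `replay_buffer`, and `global_step` arguments are assumed to come from the usual tf-agents training setup, and the directory names are placeholders.

```python
import tensorflow as tf
from tf_agents.policies import policy_saver
from tf_agents.utils import common


def make_savers(agent, replay_buffer, global_step, ckpt_dir='checkpoint'):
    # Checkpointer: saves the full training state (agent, replay buffer, step),
    # so training can be resumed exactly where it left off.
    train_checkpointer = common.Checkpointer(
        ckpt_dir=ckpt_dir,
        max_to_keep=1,
        agent=agent,
        policy=agent.policy,
        replay_buffer=replay_buffer,
        global_step=global_step,
    )
    # Restores the latest checkpoint if one exists, otherwise initializes fresh.
    train_checkpointer.initialize_or_restore()

    # PolicySaver: saves only the policy (much lighter, enough for evaluation
    # and gif generation).
    saver = policy_saver.PolicySaver(agent.policy)
    return train_checkpointer, saver


# During/after training (placeholder directory name):
#   train_checkpointer.save(global_step)
#   saver.save('policy')
# Later, load the saved policy without rebuilding the agent:
#   saved_policy = tf.saved_model.load('policy')
```

Restoring via initialize_or_restore() is what lets the training scripts below pick up where they left off.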

Generating Gifs

  1. Use separate, fresh environments
  2. Ensure that the agents don't get stuck in an infinite loop; you could
    1. Restart and fire on life lost (Breakout)
    2. Limit the number of steps (might be a pity if you have a really good policy)
  3. Use Python's string formatting to automatically generate gif filenames.
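Point 3 above can be sketched like this. The helper name and naming scheme are mine (chosen to match the result files below), built with Python's str.format:

```python
# Hypothetical helper: build a gif filename from the game, the agent type,
# and the training step, using str.format.
def gif_name(game, algo, step):
    return "{}_{}_{}.gif".format(game.lower(), algo.lower(), step)


print(gif_name("Assault", "DQN", 90000))  # assault_dqn_90000.gif
```

Calling this once per evaluation run gives uniquely named gifs without any manual bookkeeping.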

The tutorial https://www.tensorflow.org/agents/tutorials/10_checkpointer_policysaver_tutorial?hl=fi_FI has the code I used to generate gifs.

Results

A .zip archive with Python files to train models, save them, and generate .gifs: scripts.zip

Note that the scripts trainDqn and trainPPO will restore the previously trained models and continue where they left off.

Some results:

assault_dqn_1

assault_dqn_10000

assault_dqn_90000

breakout_1

breakout_90000