Olle - Atari mania
Presentation
project_pres.pdf
Quick recap: I trained DQN and PPO agents on Atari games. I used the OpenAI Gym environments and trained on the GameName-v4 variants (i.e. pixels as inputs). I used the TF-Agents implementations of agents, networks, replay buffers, and drivers.
Major learned points
Save checkpoints and policies
Tutorials:
- Checkpoints (model, replay buffer, etc.): https://www.tensorflow.org/agents/api_docs/python/tf_agents/utils/common/Checkpointer?hl=fi_FI
- Policies (a lighter-weight save containing only the policy): https://www.tensorflow.org/agents/api_docs/python/tf_agents/policies/policy_saver/PolicySaver
Generating GIFs
- Use separate, fresh environments for evaluation.
- Ensure the agent doesn't get stuck in an infinite loop. You could:
  - Restart and fire on life lost (Breakout).
  - Limit the number of steps (might be a pity if you have a really good policy).
- Use Python's string formatting to automatically generate gif filenames.
The tutorial https://www.tensorflow.org/agents/tutorials/10_checkpointer_policysaver_tutorial?hl=fi_FI has the code I used to generate gifs.
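A minimal sketch combining the points above: a step-capped rollout writer plus str.format filenames. The naming scheme and helper names are my own; the rollout loop follows the pattern in the tutorial above and assumes imageio, a TF-Agents policy, and paired TF/Py eval environments:

```python
def generate_gif(policy, eval_env, eval_py_env, filename, max_steps=10_000):
    """Roll out `policy` in a fresh eval environment and write frames to a gif.

    `max_steps` caps the rollout so a looping policy cannot run forever
    (one of the pitfalls listed above). Assumes imageio is installed.
    """
    import imageio  # local import so the filename helper works without it
    with imageio.get_writer(filename, fps=30) as video:
        time_step = eval_env.reset()
        video.append_data(eval_py_env.render())
        for _ in range(max_steps):
            if time_step.is_last():
                break
            action_step = policy.action(time_step)
            time_step = eval_env.step(action_step.action)
            video.append_data(eval_py_env.render())


def gif_filename(game, agent, iteration):
    """Build names like 'breakout_dqn_90000.gif' (the scheme is my guess)."""
    return "{}_{}_{}.gif".format(game, agent, iteration)


print(gif_filename("breakout", "dqn", 90000))  # -> breakout_dqn_90000.gif
```

Deriving filenames from the game, agent, and iteration count makes it easy to dump a gif every N training iterations without overwriting earlier ones.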
Results
A .zip archive with the Python scripts to train the models, save them, and generate gifs: scripts.zip
Note that the scripts trainDqn and trainPPO will restore the previously trained models and continue where they left off.
Some results:
assault_dqn_1
assault_dqn_10000
assault_dqn_90000
breakout_1
breakout_90000