Chapter 18 - Reinforcement learning

Meeting outline

Link to nice lecture series on RL:

RL Course by David Silver - Lecture 1: Introduction to Reinforcement Learning

Tasks

Project preparations

Have a look at https://ai.googleblog.com/2019/06/introducing-google-research-football.html. Read the text and watch the YouTube clips.
Install and try the game: https://github.com/google-research/football

Reinforcement learning

Read the chapter.
Go to the chapter summary and fill in some blank words in the Lingo part. Add at least one new set of {Agent, Action, Environment, Observation} and explanation for a couple of words.
Think about exercises 3, 4, 5, 6 in the book.
Make an honest attempt to solve exercises 8 and 9.
Investigate: How does the discount factor relate to the forgetting factor (sometimes used in real-time identification of systems; Link)?
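
As a starting point for the discount-factor question in the last task, here is a minimal sketch. It assumes the forgetting factor is meant in the recursive-least-squares sense, where a sample that is k steps old gets weight lam**k. The function names are made up for illustration and do not come from the book or any library. The discount factor applies the same geometric weighting, but forward in time to future rewards.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**k * rewards[k]; the weights decay going forward in time."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def forgetting_weighted_mean(samples, lam):
    """Weighted mean where the newest sample has weight 1, the previous one
    lam, the one before that lam**2, and so on (weights decay backward in time)."""
    numerator = 0.0
    denominator = 0.0
    for sample in samples:
        numerator = lam * numerator + sample
        denominator = lam * denominator + 1.0
    return numerator / denominator


def effective_horizon(factor):
    """Sum of the geometric weights 1 + f + f**2 + ... = 1 / (1 - f)."""
    return 1.0 / (1.0 - factor)


print(discounted_return([1, 1, 1], gamma=0.5))
print(forgetting_weighted_mean([0, 0, 4], lam=0.5))
print(effective_horizon(0.95))
```

In both cases the factor sets a rough time window of 1 / (1 - factor) steps: how far ahead the agent looks in one case, how far back the estimator remembers in the other.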

Some helpful links for environments

A list of Atari game environments: https://github.com/openai/gym/wiki/Table-of-environments
Other environments (incl. third-party environments) for TF-Agents: https://github.com/openai/gym/blob/master/docs/environments.md
OpenAI gym documentation: http://gym.openai.com/docs/
Source code for the OpenAI Atari environment: https://github.com/openai/gym/blob/master/gym/envs/atari/atari_env.py

Notes on TF-Agents

The environments using the gym_wrapper do not follow the book or the notebook; I think there has been an update. In env.step(self, action), the action needs to be defined as a NumPy array (or something that implements the action.item() method), i.e. env.step(np.array(1)) would work.
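
The sketch below illustrates the point about action.item(). StubEnv is a hypothetical stand-in, not the real gym_wrapper: it only mimics the action.item() call described above, so the example runs without gym or TF-Agents installed.

```python
import numpy as np


class StubEnv:
    """Hypothetical stand-in for a wrapped environment (assumption: the
    wrapper calls action.item() on whatever is passed to step)."""

    def step(self, action):
        # Mimic the wrapper: convert the action to a plain Python scalar.
        return action.item()


env = StubEnv()

# A 0-d NumPy array implements .item(), so this works.
result = env.step(np.array(1))

# A plain Python int has no .item() method, so this raises AttributeError.
try:
    env.step(1)
    plain_int_ok = True
except AttributeError:
    plain_int_ok = False

print(result, plain_int_ok)
```

NumPy scalars such as np.int64(1) also implement .item(), so they should be accepted in the same way.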

The following commands could be helpful in understanding your environment:

env.action_space # Returns a description of the action space

env.observation_space # Returns a description of the observation space

env.get_action_meanings() # Returns descriptions of the different actions

Working notebooks

Exercise 8: Frida Chapter 18 .ipynb

Exercise 9: Olle exercise_9.ipynb