Chapter 18 - Reinforcement learning

Meeting outline

  1. Applications, what is state of the art?
    1. Beating Atari games: https://arxiv.org/abs/2003.13350
    2. Cooling of data centers: https://arxiv.org/pdf/1709.05077.pdf
    3. https://www.greenbiz.com/article/why-google-ready-entrust-energy-management-ai
    4. https://towardsdatascience.com/applications-of-reinforcement-learning-in-real-world-1a94955bcd12
  2. Discussion of the chapter
  3. Discussion of the tasks (listed below)
  4. Project discussions

Link to nice lecture series on RL:

RL Course by David Silver - Lecture 1: Introduction to Reinforcement Learning

Tasks

Project preparations

  1. Have a look at https://ai.googleblog.com/2019/06/introducing-google-research-football.html (read the text and watch the YouTube clips).
  2. Install and try the game: https://github.com/google-research/football

Reinforcement learning

  1. Read the chapter.
  2. Go to the chapter summary and fill in some of the blank words in the Lingo part. Add at least one new set of {Agent, Action, Environment, Observation} and an explanation for a couple of the words.
  3. Think about exercises 3, 4, 5, 6 in the book.
  4. Make an honest attempt to solve exercises 8 and 9.
  5. Investigate: How does the discount factor relate to the forgetting factor (sometimes used in real-time identification of systems Link)?
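
As a possible starting point for task 5, the sketch below compares the two weightings numerically. It assumes the usual definitions: the discount factor weights future rewards by gamma^k, and the forgetting factor in recursive least squares weights past squared errors by lam^(age). The function names and the constant test signals are made up for illustration and do not come from the book.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**k * rewards[k], looking forward from the current step."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def forgetting_weighted_error(errors, lam):
    """Sum of lam**age * error**2, where the newest error has age 0."""
    n = len(errors)
    return sum(lam ** (n - 1 - i) * e ** 2 for i, e in enumerate(errors))


gamma = lam = 0.95          # same value used for both factors (assumption)
ones = [1.0] * 2000         # constant rewards / constant errors

future_weight = discounted_return(ones, gamma)
past_weight = forgetting_weighted_error(ones, lam)
effective_horizon = 1.0 / (1.0 - gamma)

print(future_weight, past_weight, effective_horizon)
```

With constant signals, both sums approach 1 / (1 - factor), which is one way to read each factor as an effective horizon: the discount factor looks forward over future rewards, while the forgetting factor looks backward over past data.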

Some helpful links for environments

 

Notes on TF-Agents

Environments wrapped with gym_wrapper do not behave as described in the book or notebook; I think there has been an update. When calling env.step(action), the action needs to be a NumPy array (or something else that implements the action.item() method), e.g. env.step(np.array(1)) would work.
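
The reason a plain integer fails can be shown without installing TF-Agents. The sketch below uses a hypothetical stand-in class, ToyEnv, which is not part of TF-Agents or Gym; it only mimics the behavior described in this note by calling action.item() inside step().

```python
import numpy as np


class ToyEnv:
    """Hypothetical stand-in for a wrapped environment (not TF-Agents code)."""

    def __init__(self):
        self.position = 0

    def step(self, action):
        # Mimics the behavior described in the note: the scalar is pulled
        # out with .item(), so the action must provide that method.
        move = action.item()
        self.position += 1 if move == 1 else -1
        return self.position


env = ToyEnv()

# A zero-dimensional NumPy array implements .item(), so this works.
obs = env.step(np.array(1))

# A plain Python int has no .item() method, so this raises AttributeError.
try:
    env.step(1)
    plain_int_works = True
except AttributeError:
    plain_int_works = False

print(obs, plain_int_works)
```

If an action comes from elsewhere as a plain int, wrapping it as np.array(action) before passing it to env.step() is a simple workaround under the same assumption.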

The following commands could be helpful in understanding your environment:

env.action_space # Returns a description of the action space

env.observation_space # Returns a description of the observation space

env.get_action_meanings() # Returns descriptions of the different actions

Working notebooks 

Exercise 8: Frida Chapter 18 .ipynb

Exercise 9: Olle exercise_9.ipynb