Chapter 18 - Reinforcement learning
Meeting outline
- Applications: what is the state of the art?
  - Beating Atari games: https://arxiv.org/abs/2003.13350
  - Cooling of data centers: https://arxiv.org/pdf/1709.05077.pdf
  - https://www.greenbiz.com/article/why-google-ready-entrust-energy-management-ai
  - https://towardsdatascience.com/applications-of-reinforcement-learning-in-real-world-1a94955bcd12
- Discussion of the chapter
- Discussion of the tasks (listed below)
- Project discussions
Link to a nice lecture series on RL:
RL Course by David Silver - Lecture 1: Introduction to Reinforcement Learning
Tasks
Project preparations
- Have a look at https://ai.googleblog.com/2019/06/introducing-google-research-football.html: read the text and watch the YouTube clips.
- Install and try the game: https://github.com/google-research/football
Reinforcement learning
- Read the chapter.
- Go to the chapter summary and fill in some of the blank words in the Lingo part. Add at least one new set of {Agent, Action, Environment, Observation} and an explanation for a couple of the words.
- Think about exercises 3, 4, 5, 6 in the book.
- Make an honest attempt to solve exercises 8 and 9.
- Investigate: How does the discount factor relate to the forgetting factor (sometimes used in real-time system identification, Link)?
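As a starting point for the investigation, here is a minimal sketch that puts the two ideas side by side. It assumes the usual textbook definitions: the discount factor weights a reward k steps into the future by gamma**k, and the forgetting factor weights a sample k steps in the past by lam**k (as in recursive least squares). The function names are made up for illustration and do not come from the book or any library.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**k * rewards[k], where k counts steps into the future."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def forgetting_weighted_sum(samples, lam):
    """Sum of lam**k * sample, where k counts steps into the past.

    The newest sample (last in the list) gets weight 1.
    """
    s = 0.0
    for x in samples:
        s = lam * s + x
    return s


def effective_horizon(factor):
    """Rough number of steps that matter: 1 / (1 - factor), for factor < 1."""
    return 1.0 / (1.0 - factor)


rewards = [1.0, 2.0, 4.0]

# Looking forward: the first reward counts most.
forward = discounted_return(rewards, gamma=0.5)

# Looking backward over the time-reversed sequence gives the same number,
# because both use the same geometric weights, just in opposite directions.
backward = forgetting_weighted_sum(list(reversed(rewards)), lam=0.5)

print(forward, backward)
print(effective_horizon(0.95))
```

Under these assumptions the two factors play the same mathematical role (geometric weighting with a time scale of roughly 1 / (1 - factor) steps), but one looks ahead at future rewards and the other looks back at past data.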
Some helpful links for environments
- A list of Atari game environments: https://github.com/openai/gym/wiki/Table-of-environments
- Other environments (incl. third-party environments) for TF-Agents: https://github.com/openai/gym/blob/master/docs/environments.md
- OpenAI Gym documentation: http://gym.openai.com/docs/
- Source code for the OpenAI Atari environment: https://github.com/openai/gym/blob/master/gym/envs/atari/atari_env.py
Notes on TF-Agents
Environments that use the gym_wrapper do not behave as described in the book or notebook; I think there has been an update. In env.step(self, action), the action needs to be a numpy array (or something else that implements the item() method), e.g. env.step(np.array(1)) would work.
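The following sketch illustrates why a plain Python int fails while a zero-dimensional numpy array works. ToyEnv is a hypothetical stand-in written for this note, not TF-Agents code; it only assumes, as described above, that the wrapper converts the action by calling action.item().

```python
import numpy as np


class ToyEnv:
    """Hypothetical stand-in for a wrapped environment (not TF-Agents code)."""

    def step(self, action):
        # Assumption: the wrapper unpacks the action with .item()
        chosen = action.item()
        return {"action_taken": chosen}


env = ToyEnv()

# A 0-d numpy array implements .item(), so this works.
result = env.step(np.array(1))
print(result)

# A plain int has no .item() method, so the same call fails.
plain_int_failed = False
try:
    env.step(1)
except AttributeError:
    plain_int_failed = True
print(plain_int_failed)
```

If your own environment raises an AttributeError mentioning item, wrapping the action in np.array(...) is the first thing to try.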
The following commands could be helpful in understanding your environment:
env.action_space # Returns a description of the action space
env.observation_space # Returns a description of the observation space
env.get_action_meanings() # Returns descriptions of the different actions
Working notebooks
Exercise 8: Frida Chapter 18 .ipynb
Exercise 9: Olle exercise_9.ipynb